State gradients for analyzing memory in LSTM language models

2020 
Abstract

Gradients can be used to train neural networks, but they can also be used to interpret them. We investigate how well the inputs of RNNs are remembered by their state by calculating 'state gradients' and applying SVD to the gradient matrix, which reveals which directions in embedding space are remembered and to what extent. Our method can be applied to any RNN and reveals which properties of the embedding space influence the state space, without the need to know and label those properties beforehand. In this paper, we propose a normalization method that alleviates the influence of variance in the embedding space on the state gradients, and we show the effectiveness of our method on a synthetic dataset. Additionally, we investigate the influence of several training settings on the RNN's memory, and a more fine-grained analysis based on POS and word types shows that LSTM language models learn to model linguistic intuitions. Our code and datasets are publicly available.
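The core computation the abstract describes can be illustrated in a few lines. The sketch below is not the paper's implementation: it uses a toy vanilla tanh RNN with random weights (the paper studies trained LSTM language models), and all sizes and variable names are assumptions. It computes the Jacobian of the final state with respect to the first input embedding via the chain rule, then takes its SVD: the right singular vectors are directions in embedding space, and the singular values measure how strongly each direction is retained in the state.

```python
import numpy as np

rng = np.random.default_rng(0)
d_emb, d_state, T = 4, 6, 5  # toy sizes, chosen for illustration

# Vanilla tanh RNN (stand-in for the paper's LSTM): h_t = tanh(W h_{t-1} + U x_t)
W = rng.normal(scale=0.3, size=(d_state, d_state))
U = rng.normal(scale=0.3, size=(d_state, d_emb))
xs = rng.normal(size=(T, d_emb))  # a toy sequence of input embeddings

# Forward pass, keeping each hidden state for the Jacobian accumulation
h = np.zeros(d_state)
hs = []
for x in xs:
    h = np.tanh(W @ h + U @ x)
    hs.append(h)

# 'State gradient': Jacobian of the final state h_T w.r.t. the first input x_1,
# accumulated through the tanh nonlinearities with the chain rule.
J = np.diag(1 - hs[0] ** 2) @ U          # d h_1 / d x_1
for t in range(1, T):
    J = np.diag(1 - hs[t] ** 2) @ W @ J  # d h_t / d x_1

# SVD of the gradient matrix: singular values quantify how much of each
# embedding-space direction (rows of Vt) survives in the state after T steps.
Uv, S, Vt = np.linalg.svd(J)
print(S)
```

In practice the Jacobian would be obtained with automatic differentiation on the trained model rather than by hand, but the SVD step and its interpretation are the same: a rapidly decaying spectrum means only a low-dimensional subspace of the embedding space is still remembered after T steps.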