Dependency-based syntax-aware word representations

2020 
Abstract Dependency syntax has been demonstrated to be highly useful for a number of natural language processing (NLP) tasks. Typical approaches to utilizing dependency syntax include Tree-RNN and Tree-Linearization, both of which exploit explicit 1-best tree outputs from a well-trained parser as inputs. However, these approaches may suffer from error propagation due to the inevitable errors contained in the 1-best tree outputs. In this work, we propose a novel approach to integrating dependency syntax without using the discrete tree outputs. The key idea is to use the intermediate hidden representations of a well-trained encoder-decoder dependency parser, which are referred to as Dependency-based Syntax-Aware Word Representations (Dep-SAWRs). We then simply concatenate such Dep-SAWRs with the conventional context-insensitive word embeddings to compose input word representations, without requiring any modification to the model architecture of the downstream tasks. We evaluate the proposed method on four kinds of typical NLP tasks, including sentence classification, sentence matching, sequence labeling and machine translation. Experimental results show that the proposed approach is highly promising. On the one hand, it can utilize dependency syntax effectively, bringing consistently better performance on the four tasks compared with baselines that do not use syntax. On the other hand, the proposed method outperforms the Tree-RNN and Tree-Linearization approaches in most settings, while being highly efficient in syntax integration. In addition, the proposed method could be easily extended to encode other structural attributes of language.
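The core idea can be illustrated with a minimal sketch: hidden states from a pre-trained (and frozen) dependency parser encoder are concatenated with ordinary word embeddings to form the input representations of a downstream model. The class and parameter names below (e.g., DepSAWRInput, parser_encoder) are illustrative assumptions for exposition, not the authors' actual code.

```python
import torch
import torch.nn as nn

class DepSAWRInput(nn.Module):
    """Compose downstream input representations by concatenating word
    embeddings with Dep-SAWRs, i.e. intermediate hidden states taken from
    the encoder of a pre-trained dependency parser (kept frozen)."""

    def __init__(self, vocab_size, emb_dim, parser_encoder):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        # Assumed interface: parser_encoder maps word ids of shape
        # (batch, seq_len) to hidden states of shape (batch, seq_len, parser_dim).
        self.parser_encoder = parser_encoder
        for p in self.parser_encoder.parameters():
            p.requires_grad = False  # the parser is not fine-tuned

    def forward(self, word_ids):
        emb = self.word_emb(word_ids)             # (batch, seq_len, emb_dim)
        with torch.no_grad():
            sawr = self.parser_encoder(word_ids)  # (batch, seq_len, parser_dim)
        # Concatenation along the feature dimension; no change to the
        # downstream model architecture is required beyond its input size.
        return torch.cat([emb, sawr], dim=-1)     # (batch, seq_len, emb_dim + parser_dim)
```

Because the syntactic signal lives in continuous vectors rather than a discrete 1-best tree, no tree decoding is needed at the input of the downstream task, which is what avoids the error-propagation issue described above.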