Capturing SQL Query Overlapping via Subtree Copy for Cross-Domain Context-Dependent SQL Generation

2021 
The key challenge of cross-domain context-dependent text-to-SQL generation lies in capturing the relations between natural language utterances and SQL queries across different turns. One line of work attempts to address this challenge by capturing the overlaps among consecutively generated SQL queries. Existing models generate the SQL query for a single turn sequentially and model SQL overlaps by copying tokens or segments generated in previous turns. However, they are not flexible enough to capture the various granularities of overlap, e.g., columns, filters, or even the whole query, because they neglect the intrinsic structure of SQL queries. In this paper, we employ a tree-structured intermediate representation of SQL queries, i.e., SemQL, for SQL generation and propose a novel subtree-copy mechanism to characterize SQL overlaps. At each turn, we encode the interaction questions and the previously generated trees as context and decode the SemQL tree in a top-down fashion. Each node is either generated according to the SemQL grammar or copied from previously generated SemQL subtrees. Our model can capture various overlapping granularities by copying nodes at different levels of the SemQL trees. We evaluate our approach on the SParC dataset, and the experimental results show that our model outperforms state-of-the-art baselines.
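The generate-or-copy decision described in the abstract can be illustrated with a toy sketch. This is not the paper's neural model: it is a deterministic oracle that, given the tree of the previous turn and the target tree of the current turn, walks the target top-down and emits a copy action whenever a whole subtree already exists in the previous tree. The names `Node`, `subtrees`, `render`, and `oracle_actions`, as well as the node labels, are hypothetical and do not follow the actual SemQL grammar.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """A hashable tree node; labels are illustrative, not real SemQL rules."""
    label: str
    children: Tuple["Node", ...] = ()


def subtrees(root: Node) -> List[Node]:
    """Enumerate every subtree of `root` in pre-order."""
    found = [root]
    for child in root.children:
        found.extend(subtrees(child))
    return found


def render(node: Node) -> str:
    """Render a tree as a compact bracketed string."""
    if not node.children:
        return node.label
    return node.label + "(" + " ".join(render(c) for c in node.children) + ")"


def oracle_actions(target: Node, previous: Node) -> List[Tuple[str, str]]:
    """Walk `target` top-down; copy a subtree if it occurs in `previous`,
    otherwise generate the node label and recurse into its children."""
    candidates = set(subtrees(previous))
    actions: List[Tuple[str, str]] = []

    def visit(node: Node) -> None:
        if node in candidates:
            actions.append(("COPY", render(node)))
            return  # the whole subtree is reused, so do not descend
        actions.append(("GEN", node.label))
        for child in node.children:
            visit(child)

    visit(target)
    return actions


# Hypothetical two-turn interaction: the filter is reused, the column changes.
filter_age = Node("Filter", (Node("Col:age"), Node("Val")))
prev_tree = Node("Root", (Node("Select", (Node("Col:name"),)), filter_age))
curr_tree = Node("Root", (Node("Select", (Node("Col:city"),)), filter_age))

partial_copy = oracle_actions(curr_tree, prev_tree)
whole_copy = oracle_actions(prev_tree, prev_tree)
```

In this sketch, `partial_copy` copies only the filter subtree while generating the rest, and `whole_copy` collapses to a single copy of the root, which mirrors the idea of capturing overlaps at different levels of the tree. A learned decoder would score copy and generate actions instead of matching subtrees exactly.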