Annotated Clause Boundaries’ Influence on Parsing Results

2018 
This paper studies the effect of pre-annotated clause boundaries on dependency parsing of Estonian new media texts. Our hypothesis is that correctly identified clause boundaries improve parsing: once the text is split into smaller, syntactically meaningful units, it should be easier for the parser to determine the syntactic structure of each unit. To test the hypothesis, we performed two experiments on a 14,000-word corpus of Estonian web texts whose morphological analysis had been manually validated. In the first experiment, the corpus with gold-standard morphological tags was parsed with MaltParser both with and without the manually annotated clause boundaries. In the second experiment, only the segmentation of the text was preserved, and the morphological analysis was performed automatically before parsing. The experiments confirmed our hypothesis about the influence of correct clause boundaries, though only by a small margin: in both experiments, the labeled attachment score (LAS) improved by 0.6%.
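
For readers unfamiliar with the metric, LAS (labeled attachment score) is the share of tokens for which the parser predicts both the correct head and the correct dependency label; UAS (unlabeled attachment score) checks the head only. The following is a minimal sketch of how such scores could be computed, not the evaluation script used in the paper. The function name `attachment_scores`, the `(head, label)` tuple representation, and the toy labels are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: one (head index, dependency label) pair per token.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as fractions in [0, 1] for two aligned analyses."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")

    # UAS: the head matches, regardless of label.
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: both head and label match.
    las_hits = sum(g == p for g, p in zip(gold, predicted))

    n = len(gold)
    return uas_hits / n, las_hits / n


# Toy four-token sentence with illustrative labels (not the paper's label set).
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl"), (3, "punct")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

In this toy example, the third token has the right head but the wrong label, so it counts toward UAS but not LAS; the fourth token has the wrong head and counts toward neither.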