Making sense of discourse: On discourse segmentation and the linguistic marking of coherence relations

2018 
To fully understand a discourse, it is essential not only to know the meaning of each individual clause, but also to figure out how all clauses are related to each other. If all goes well, language users end up with an accurate representation of the discourse. An important aspect of building a mental representation of a discourse is inferring coherence relations between discourse segments. When inferring coherence relations, language users have to deduce whether two or more chunks of text constitute, for instance, a cause-consequence relation, a rule and an exception, alternatives, etc. Coherence relations can be explicitly marked by a connective (e.g., because, but) or a cue phrase (e.g., more specifically, by comparison), but this need not be the case; in many instances, coherence relations have to be established without the instructions connectives provide or, in the case of underspecified connectives, with only limited instructions. This dissertation investigates both discourse segmentation and the linguistic marking of coherence relations, using a combination of theoretical exploration, qualitative and quantitative corpus studies, and experimental methods. Chapters 2 and 6 address the question of which parts of a text language users infer coherence relations between. Using corpus examples, Chapter 2 explores which parts of a text make up the discourse segments (or idea units) that are related to each other in the mental representation of a text. It is argued that, while elements that make up the propositional content of a text are always part of a discourse segment, other linguistic elements (e.g., stance markers) can, but need not, be part of the discourse segments. In a series of psycholinguistic experiments, Chapter 6 demonstrates that language users can infer coherence relations between restrictive relative clauses and their matrix clauses.
This suggests that restrictive relative clauses should not be categorically excluded as discourse segments, as has been the case in the majority of discourse segmentation guidelines. The linguistic marking of coherence relations is explored in Chapters 4 and 5. The parallel corpus study in Chapter 4 suggests that the linguistic marking of coherence relations is influenced by cognitive complexity, with complex relations (such as conditional relations or concessions) being more often explicitly signaled than simple relations (such as cause-consequence or additive relations). Chapter 5 uses parallel corpus data to argue that linguistic elements other than connectives can also function as signals for coherence relations, and that the presence of signals inside the discourse segments can influence whether the relation is marked by a connective. This dissertation provides new insights into how and why coherence relations are explicitly marked and between which parts of a text people infer coherence relations. In addition, it contributes to refining discourse segmentation (Chapter 2) and annotation (Chapter 3) guidelines, both of which are important methodological tools in research on discourse coherence. Finally, by using parallel corpus data, it helps map out how coherence relations are translated by human translators, an important preliminary step toward improving the quality of machine translation output at the discourse level.