The CommonLit Ease of Readability (CLEAR) Corpus

2021 
In this paper, we introduce the CommonLit Ease of Readability (CLEAR) corpus. The corpus provides researchers within the educational data mining community with a resource from which to develop and test readability metrics and to model text readability. The CLEAR corpus offers a number of improvements over previous readability corpora, including its size (N = ~5,000 reading excerpts), the breadth of the excerpts available, which cover over 250 years of writing in two different genres, and the readability criterion used (teachers’ ratings of text difficulty for their students). This paper discusses the development of the corpus and presents reliability metrics as well as initial analyses of readability.