A Diachronic Italian Corpus based on “L’Unità”
2020
In this paper, we describe the creation of a diachronic corpus for Italian by exploiting the digital archive of the newspaper “L’Unit`a”. We automatically clean and annotate the corpus with PoStags, lemmas, named entities and syntactic dependencies. Moreover, we compute frequency-based time series for tokens, lemmas and entities. We show some interesting corpus statistics taking into account the temporal dimension and describe some examples of usage of time series.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
2
References
1
Citations
NaN
KQI