The Icelandic Contemporary Treebank (IceConTree)

2020 
The Icelandic Contemporary Treebank (IceConTree) is a machine-parsed treebank annotated according to the IcePaHC annotation scheme. It consists of texts from the Icelandic Gigaword Corpus, parsed using the IceNeuralParsingPipeline. It contains 524,601,329 words in 29,929,132 clauses. The treebank consists of 14 texts, mainly media, law and parliamentary text. Within each text, files are divided by year. This division was done after the text was parsed and is therefore not completely accurate. The texts are labelled according to the following genres:
• par: parliamentary text
• spe: speech
• law: law text
• med: text from media
• rad: text from radio
• onl: text from the Internet
• tel: text from television
• enc: encyclopedia
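The genre labels and the reported sizes can be captured in a small lookup, which may help when sorting files by genre. This is only a sketch: the names `GENRES`, `describe_genre` and `mean_words_per_clause` are hypothetical and not part of the treebank distribution, and how the genre code appears in file names is not specified here, so the caller is assumed to supply the code directly.

```python
# Genre codes exactly as listed in the treebank description.
GENRES = {
    "par": "parliamentary text",
    "spe": "speech",
    "law": "law text",
    "med": "text from media",
    "rad": "text from radio",
    "onl": "text from the Internet",
    "tel": "text from television",
    "enc": "encyclopedia",
}

# Sizes reported in the description.
TOTAL_WORDS = 524_601_329
TOTAL_CLAUSES = 29_929_132


def describe_genre(code: str) -> str:
    """Return the description for a genre code; raise ValueError if unknown."""
    try:
        return GENRES[code.lower()]
    except KeyError:
        raise ValueError(f"unknown genre code: {code!r}") from None


def mean_words_per_clause() -> float:
    """Average clause length implied by the reported totals."""
    return TOTAL_WORDS / TOTAL_CLAUSES


if __name__ == "__main__":
    print(describe_genre("med"))
    print(f"{mean_words_per_clause():.2f} words per clause on average")
```

The reported totals imply an average of roughly 17.5 words per clause.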