MULTILIT : manual, criteria of transcription and analysis for German, Turkish and English

Christoph Schroeder, Christin Schellhardt, Mehmet-Ali Akinci, Meral Dollnick, Ginesa Dux, Esin Işil Gülbeyaz, Anne Jähnert, Ceren Koç-Gültürk, Patrick Kühmstedt, Florian Kuhn, Verena Mezger, Carol W. Pfaff, Betül Sena Ürkmez

2015

This paper presents an overview of the linguistic analyses developed in the MULTILIT project and of the processing of the oral and written texts collected. The project investigates the language abilities of multilingual children and adolescents, in particular those who have Turkish and/or Kurdish as a mother tongue. A further aim of the project is to examine, from a psycholinguistic and sociolinguistic perspective, the extent to which competence in academic registers is achieved on the basis of the languages spoken by the children, including the language(s) spoken at home, the language of the country of residence and the first foreign language. To be able to examine these questions using corpus linguistic parameters, we created categories of analysis in MULTILIT. The data collection comprises texts from bilingual and monolingual children and adolescents in Germany in their first language Turkish, their second language German and their foreign language English. Pupils aged between nine and twenty produced monologic oral and written texts in two genres, narrative and discursive. On the basis of these samples, we examine linguistic features such as lexical expression (lexical density, lexical diversity), syntactic complexity (syntactic and discursive packaging) as well as phonology in the oral texts and orthography in the written texts, with the aim of investigating the pupils’ growing mastery of these features in academic and informal registers. To this end the raw data have been transcribed using transcription conventions developed especially for the needs of the MULTILIT data. They are based on the commonly used HIAT and GAT transcription conventions and supplemented with conventions that provide additional information such as features at the graphic level.
The categories of analysis comprise a large number of linguistic categories such as word classes, syntax, noun phrase complexity, complex verbal morphology, direct speech and text structures. We also annotate errors and norm deviations at a wide range of levels (orthographic, morphological, lexical, syntactic and textual). In view of the different language systems, these criteria are considered separately for each language investigated in the project.
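
To make the two lexical measures named above concrete, the following is a minimal sketch using textbook definitions: lexical density as the share of content words among all tokens, and lexical diversity as a plain type-token ratio. These definitions, the tag set in CONTENT_TAGS, and the function names are assumptions made for illustration; they are not taken from the MULTILIT manual, which defines its own word classes and may compute these measures differently.

```python
from typing import List, Tuple

# Hypothetical coarse tag set for content words (an assumption for this
# sketch; the MULTILIT manual defines its own word-class annotation).
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}

TaggedToken = Tuple[str, str]  # (word form, word-class tag)


def lexical_density(tagged: List[TaggedToken]) -> float:
    """Share of content words among all tokens (0.0 for an empty text)."""
    if not tagged:
        return 0.0
    content = sum(1 for _, tag in tagged if tag in CONTENT_TAGS)
    return content / len(tagged)


def type_token_ratio(tagged: List[TaggedToken]) -> float:
    """Distinct lower-cased word forms divided by the token count."""
    if not tagged:
        return 0.0
    types = {word.lower() for word, _ in tagged}
    return len(types) / len(tagged)


# Toy, already-tagged sentence; real input would come from the annotated corpus.
sample = [
    ("The", "DET"), ("dog", "NOUN"), ("chased", "VERB"),
    ("the", "DET"), ("cat", "NOUN"),
]

print(lexical_density(sample))   # 3 content words out of 5 tokens
print(type_token_ratio(sample))  # 4 distinct forms out of 5 tokens
```

A plain type-token ratio is sensitive to text length, so comparisons across texts of different sizes usually call for a length-corrected variant; which variant the project uses is not stated here.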
