Developing a Corpus-Based Film English Word List

2020 
This paper develops a list of English words that occur frequently in film subtitles and film dialogue. To this end, two corpora were compiled, one for each register: the English Subtitles for Korean Film Corpus and the English Dialogue for American Film Corpus, each containing about 1.2 million running words drawn from three genres. The word lists derived from the two corpora were compared for similarities and differences. Because the lists proved similar, the two corpora were merged into a single integrated corpus, the Film English Corpus (FEC), of 2.4 million tokens. Words were selected for the Film English Word List (FEWL) on three criteria: 1) frequency: a word must occur more than 100 times; 2) range: a word must occur in all six genres; 3) word class: only content words are included. A total of 1,292 word families were identified for the FEWL. These word families were then compared, in terms of frequency, with those in the General Service List (GSL) of West (1954) and the Academic Word List (AWL) of Coxhead (2000). Words that occur frequently in the FEWL did not necessarily occur frequently in the GSL and/or the AWL. The FEWL therefore orders its words by their frequency in the FEC rather than dividing them into GSL and AWL words.
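The three selection criteria can be illustrated with a minimal Python sketch. It is not the authors' procedure: it counts individual word forms rather than word families, and the function name `select_words`, its parameters, and the tiny `FUNCTION_WORDS` set are all hypothetical stand-ins chosen for illustration.

```python
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption);
# a real study would use a much fuller inventory.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "you", "i"}


def select_words(genre_tokens, min_freq=100, function_words=FUNCTION_WORDS):
    """Apply the three criteria to word forms (not word families).

    genre_tokens: dict mapping a genre name to a list of lowercase tokens.
    Returns the selected words sorted by descending total frequency.
    """
    # Total frequency of each word across the whole combined corpus.
    total = Counter()
    for tokens in genre_tokens.values():
        total.update(tokens)

    # Vocabulary of each genre, used for the range check.
    genre_vocab = [set(tokens) for tokens in genre_tokens.values()]

    selected = []
    for word, freq in total.items():
        if freq <= min_freq:  # criterion 1: must occur MORE than min_freq times
            continue
        if not all(word in vocab for vocab in genre_vocab):  # criterion 2: every genre
            continue
        if word in function_words:  # criterion 3: content words only
            continue
        selected.append(word)

    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(selected, key=lambda w: (-total[w], w))
```

With the paper's settings, `genre_tokens` would hold six genres and `min_freq` would stay at its default of 100; grouping the selected forms into word families would be a separate step that this sketch does not attempt.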