Assisting Text Localization and Transcreation Tasks Using AI-Based Masked Language Modeling

2021 
Localization is the adaptation of a document’s content to meet the linguistic, cultural, and other requirements of a specific target market, known as a locale. Transcreation is the process of adapting a message from one language to another while preserving its intent, style, tone, and context. In recent years, pre-trained language models have pushed the limits of natural language understanding and generation and have driven much of the progress in NLP. We expect that AI-based pre-trained language models (e.g., those trained with masked language modeling), along with other existing and upcoming language modeling techniques, will be integrated as effective tools to support localization and transcreation efforts in the coming years. To support these tasks, we use AI-based Masked Language Modeling (MLM) as a human-machine teaming tool: a language model is queried for the words or phrases that best reflect the linguistic and cultural characteristics of the target language. For linguistic applications, we give examples involving logical connectives, pronouns and antecedents, and redundant nouns and verbs. For intercultural conceptualization applications, we give examples of cultural event schemas, role schemas, emotional schemas, and propositional schemas. Mask positions can be chosen in two ways: by a human or by an algorithm. In the algorithm-based approach, constituency parsing breaks a text into sub-phrases (constituents), typical linguistic patterns are then detected, and masking is finally applied to the matching text.
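The algorithm-based masking workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a simple regular expression over a small, hypothetical list of logical connectives stands in for constituency parsing, and the language model is passed in as a callable (`fill_mask`, a name assumed here) that maps a masked sentence to `(word, score)` candidates. The `[MASK]` token follows the BERT convention; other models use different mask tokens.

```python
import re
from typing import Callable, Dict, List, Tuple

MASK = "[MASK]"  # BERT-style mask token (assumption; model-dependent)

# Hypothetical, deliberately small list of logical connectives to probe.
CONNECTIVES = ("however", "therefore", "moreover", "because", "although")

# A scorer maps a masked sentence to (candidate_word, score) pairs.
Scorer = Callable[[str], List[Tuple[str, float]]]


def mask_connectives(text: str, mask: str = MASK) -> List[Tuple[str, str]]:
    """Return (original_word, masked_text) pairs, one per connective found."""
    pattern = re.compile(r"\b(" + "|".join(CONNECTIVES) + r")\b", re.IGNORECASE)
    queries = []
    for match in pattern.finditer(text):
        # Replace only this occurrence, leaving the rest of the text intact.
        masked = text[:match.start()] + mask + text[match.end():]
        queries.append((match.group(0), masked))
    return queries


def suggest(text: str, fill_mask: Scorer, top_k: int = 3) -> List[Dict]:
    """Query the model once per masked position and keep the top_k candidates."""
    results = []
    for original, masked in mask_connectives(text):
        ranked = sorted(fill_mask(masked), key=lambda c: c[1], reverse=True)
        candidates = ranked[:top_k]
        results.append({
            "original": original,
            "masked": masked,
            "candidates": candidates,
            # Flag positions where the model does not propose the original
            # word, so a human reviewer can look at them first.
            "original_in_top_k": original.lower()
            in [word.lower() for word, _ in candidates],
        })
    return results
```

In a human-machine teaming setting, positions where `original_in_top_k` is false would be candidates for a translator's review rather than automatic replacement. With the Hugging Face `transformers` library installed, one possible scorer (an assumption, not something specified in the abstract) wraps a fill-mask pipeline: `fill = pipeline("fill-mask", model="bert-base-uncased")` followed by `lambda s: [(r["token_str"], r["score"]) for r in fill(s)]`.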