KLSI Methods for Human Simultaneous Interpretation and Towards Building a Simultaneous Machine Translation System Reflecting the KLSI Methods.

2021 
Simultaneous machine translation aims to maintain translation quality while minimizing the delay between reading the input and incrementally producing the output. KL Simultaneous Interpreting (KLSI) is a set of methods developed to deliver non-revisable translation from English into Chinese with one second as the benchmark latency. To achieve this, it trains the human brain to stop thinking and execute commands instead, using a range of formulaic techniques that are applied mechanically. This paper presents some of the key features and techniques of KLSI and explores their implications for machine simultaneous interpreting. The techniques include convergence, the concept of interpreting within three words heard at any given moment in time, co-texting, defaulting, and sequential translation techniques such as repeat, replace, reverse logic, and SAI (Skip, Add, Insert). Multiple English-to-Chinese examples and video recordings are listed in the paper to illustrate the KLSI features and techniques. In the second part of this paper, we describe computational methods related to KLSI techniques: the wait-k policy with and without anticipation; word-based and phrase-based alignment and mapping; and constrained context for neural machine translation. Commercial machine translation systems require at least several gigabytes (or millions of words) of parallel documents per language pair for training. It is not realistic to obtain enough English-Chinese parallel texts reflecting KLSI techniques to train a neural machine translation system. However, many parallel texts exist that do not reflect KLSI techniques. We propose a rule-based approach that modifies available parallel texts using KLSI rules to generate a sufficiently large KLSI-based corpus. The main contribution of this paper is a novel rule-based approach, with rules reflecting human interpretation traits, for revising the training corpus to enable short-latency, non-revisable machine translation.
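The wait-k policy named above can be illustrated with a minimal sketch. It assumes the commonly described formulation without anticipation: read the first k source tokens, then alternate between writing one target token and reading one more source token until the source is exhausted. The function names and the example sizes are hypothetical and are not taken from the paper.

```python
from typing import List


def wait_k_schedule(source_len: int, target_len: int, k: int) -> List[str]:
    """Return the READ/WRITE action sequence of a wait-k policy.

    Assumption: before writing target token t (0-indexed), the system
    must have read min(k + t, source_len) source tokens.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    actions: List[str] = []
    read = 0
    for written in range(target_len):
        needed = min(k + written, source_len)
        # Catch up on source tokens until the wait-k condition holds.
        while read < needed:
            actions.append("READ")
            read += 1
        # Output is committed here and never revised afterwards.
        actions.append("WRITE")
    return actions


def visible_source_per_target(source_len: int, target_len: int, k: int) -> List[int]:
    """Number of source tokens available when each target token is written."""
    return [min(k + t, source_len) for t in range(target_len)]


if __name__ == "__main__":
    # Hypothetical sizes: 4 source tokens, 4 target tokens, k = 2.
    print(wait_k_schedule(4, 4, 2))
    print(visible_source_per_target(4, 4, 2))
```

A smaller k lowers latency but gives the model less source context for each committed target token. This is the trade-off that a training corpus reflecting sequential, non-revisable translation is meant to ease.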