Learning-based Intrasentence Segmentation for Efficient Translation of Long Sentences

2001 
Long-sentence analysis has been a critical problem in machine translation because of its high complexity. Intrasentence segmentation has been proposed as a method for reducing parsing complexity. This paper presents a two-step segmentation method: (1) identifying potential segmentation positions in a sentence and (2) selecting an actual segmentation position among them. We apply two machine learning techniques to the segmentation task: "concept learning" and "genetic learning". The rules for identifying potential segmentation positions are obtained by learning the "SegmentablePosition" concept. The actual segmentation position is selected by a function whose parameters are determined by genetic learning. Experimental results illustrate the effectiveness of our approach to long-sentence parsing for MT and show improved segmentation performance in comparison with existing methods.
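The two-step method can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the paper's actual rules or function: the word list `CANDIDATE_WORDS` stands in for the learned "SegmentablePosition" rules, the two features in `score` (split balance and comma presence) are hypothetical, and the `weights` are fixed by hand where the paper would determine them by genetic learning.

```python
from typing import List, Optional, Sequence

# Hypothetical stand-in for the learned "SegmentablePosition" rules:
# a token is a candidate if it is a comma or one of these connectives.
CANDIDATE_WORDS = {",", "and", "but", "because", "which", "that"}


def candidate_positions(tokens: Sequence[str]) -> List[int]:
    """Step 1: return indices of tokens that may serve as segmentation positions.

    The first and last tokens are excluded so that both resulting
    segments are non-empty.
    """
    return [
        i
        for i, tok in enumerate(tokens)
        if 0 < i < len(tokens) - 1 and tok.lower() in CANDIDATE_WORDS
    ]


def score(tokens: Sequence[str], pos: int, weights: Sequence[float]) -> float:
    """Weighted sum of two illustrative features for a candidate position.

    weights[0] scales how balanced the split is (1.0 at the midpoint);
    weights[1] scales a bonus for splitting at a comma.
    In the paper, such parameters are tuned by genetic learning.
    """
    n = len(tokens)
    balance = 1.0 - abs(2 * pos - n) / n
    is_comma = 1.0 if tokens[pos] == "," else 0.0
    return weights[0] * balance + weights[1] * is_comma


def select_position(
    tokens: Sequence[str], weights: Sequence[float]
) -> Optional[int]:
    """Step 2: pick the highest-scoring candidate, or None if there is none."""
    candidates = candidate_positions(tokens)
    if not candidates:
        return None
    return max(candidates, key=lambda p: score(tokens, p, weights))


if __name__ == "__main__":
    sentence = "I went home , and she stayed at the office because it rained"
    tokens = sentence.split()
    print(candidate_positions(tokens))
    print(select_position(tokens, weights=(1.0, 0.0)))
```

Keeping candidate identification separate from selection mirrors the two steps of the method: the rule set and the scoring parameters can then be learned independently.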