Learning-based Intrasentence Segmentation for Efficient Translation of Long Sentences

Sung Dong Kim, Byoung-Tak Zhang, Yung Taek Kim

2001

Long-sentence analysis has been a critical problem in machine translation (MT) because of its high complexity. Intrasentence segmentation has been proposed as a method for reducing parsing complexity. This paper presents a two-step segmentation method: (1) identifying potential segmentation positions in a sentence and (2) selecting an actual segmentation position among them. We apply two machine learning techniques to the segmentation task: "concept learning" and "genetic learning". The rules for identifying potential segmentation positions are obtained by learning the "SegmentablePosition" concept. The actual segmentation position is selected by a function whose parameters are determined by genetic learning. Experimental results illustrate the effectiveness of our approach to long-sentence parsing for MT and show improved segmentation performance in comparison with existing methods.
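The two-step structure described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the trigger-word set `TRIGGER_WORDS` stands in for the learned "SegmentablePosition" rules, the two features inside `score` (split balance and a comma indicator) are hypothetical, and the weights are set by hand here, whereas the paper determines the parameters of its selection function by genetic learning.

```python
from typing import List, Optional, Sequence

# Hypothetical stand-in for the learned "SegmentablePosition" rules:
# a token from this set marks a potential segmentation position.
TRIGGER_WORDS = {",", "and", "but", "which", "that", "because"}


def candidate_positions(tokens: Sequence[str]) -> List[int]:
    """Step 1: indices i where the sentence could be split before tokens[i]."""
    last = len(tokens) - 1
    return [
        i for i, tok in enumerate(tokens)
        if 0 < i < last and tok.lower() in TRIGGER_WORDS
    ]


def score(tokens: Sequence[str], pos: int, weights: Sequence[float]) -> float:
    """Weighted sum of two illustrative features (both are assumptions)."""
    n = len(tokens)
    # 1.0 when the split leaves two equally long segments, lower otherwise.
    balance = 1.0 - abs(pos - (n - pos)) / n
    # 1.0 when the candidate token is a comma.
    is_comma = 1.0 if tokens[pos] == "," else 0.0
    return weights[0] * balance + weights[1] * is_comma


def select_position(tokens: Sequence[str],
                    weights: Sequence[float]) -> Optional[int]:
    """Step 2: pick the highest-scoring candidate, or None if there is none."""
    candidates = candidate_positions(tokens)
    if not candidates:
        return None
    return max(candidates, key=lambda p: score(tokens, p, weights))


if __name__ == "__main__":
    sentence = "He left , but she stayed because the meeting had not finished"
    tokens = sentence.split()
    print(candidate_positions(tokens))          # [2, 3, 6]
    print(select_position(tokens, (1.0, 0.0)))  # 6
```

With weights `(1.0, 0.0)` only the balance feature counts, so the split falls before "because"; a genetic search over the weight vector would replace the hand-set values in a fuller implementation.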
