ABSTRACT

The aim of text segmentation is to partition a text into topically coherent parts. The result is a structure that resembles the table of contents of a book, where each chapter and section focuses on a specific topic within a story. Segmentation supports machine translation by limiting the size, scope and context of the input text. It improves translation speed by reducing the input from a complete story to a series of shorter, independent text segments, which reduces the search space and the number of candidate translations to select from. In practical applications, the segments can also be processed in parallel for a further gain in speed. Segmentation also improves translation accuracy by reducing the level of ambiguity (e.g. river bank, world bank) and the range of references (e.g. he, she, the president) in the input text, enabling the translation process to generate the most appropriate and specific output for the local context.