ABSTRACT

The intersection of word alignments obtained in both translation directions forms the starting point for constructing phrase alignments. Phrase translation probabilities require phrases from both the input and the candidate translation. So do distortion probabilities, which are functions of how the input phrases are reordered once they have been translated and placed in the output. Phrase-based statistical machine translation therefore resembles a translation task in which the input and output sentences contain the same number of words and differ only in word order. The reason is that, once the candidate translation being scored is given, the translation units are all fixed, and no phrase translation depends on any other phrase or its translation. Scoring a candidate, after all, requires that the candidate be available: the language model score is computed from n-grams of the candidate.
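
The three score components named above can be illustrated with a minimal sketch. It is not the system described here: the function name `score_candidate`, the phrase-pair tuple layout, the bigram language model with a fixed floor for unseen bigrams, and the exponential distortion penalty `alpha ** |start - prev_end - 1|` are all assumptions made for illustration. The point is only that every component is computed from a candidate whose segmentation into phrase pairs is already fixed.

```python
import math

def score_candidate(phrase_pairs, phrase_table, bigram_lm, alpha=0.5, floor=1e-6):
    """Return (translation, distortion, lm) log-scores for one candidate.

    phrase_pairs: list of (src_start, src_end, src_phrase, tgt_phrase) in
        output order; source positions are 0-based and inclusive.
    phrase_table: dict mapping (src_phrase, tgt_phrase) to a probability.
    bigram_lm: dict mapping (word, next_word) to a probability.
    alpha, floor: hypothetical distortion base and unseen-bigram probability.
    """
    translation = 0.0
    distortion = 0.0
    prev_end = -1          # source position where the previous phrase ended
    candidate = []         # candidate words, in output order

    for src_start, src_end, src, tgt in phrase_pairs:
        # Each phrase pair is scored independently of all the others.
        translation += math.log(phrase_table[(src, tgt)])
        # Penalize jumps in the source relative to monotone order.
        distortion += abs(src_start - prev_end - 1) * math.log(alpha)
        prev_end = src_end
        candidate.extend(tgt.split())

    # The language model sees only the n-grams (here bigrams) of the candidate.
    tokens = ["<s>"] + candidate + ["</s>"]
    lm = sum(math.log(bigram_lm.get(bg, floor)) for bg in zip(tokens, tokens[1:]))
    return translation, distortion, lm


# Toy data, invented for illustration only.
table = {("das haus", "the house"): 0.5, ("ist klein", "is small"): 0.25}
monotone = [(0, 1, "das haus", "the house"), (2, 3, "ist klein", "is small")]
reordered = [(2, 3, "ist klein", "is small"), (0, 1, "das haus", "the house")]

print(score_candidate(monotone, table, {}))
print(score_candidate(reordered, table, {}))
```

In this toy setting the two candidates use the same phrase pairs, so their translation scores coincide; only the distortion term (and, with a non-trivial language model, the n-gram score) tells them apart.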