ABSTRACT

This chapter proposes a Hindi-to-English transliteration approach that takes into account the origin of a named entity (NE) in the source language. It presents Statistical Machine Translation (SMT)-based Devanagari-to-Roman transliteration that handles two origins separately: Indo-Aryan-Hindi (IAH) and Indo-Aryan-Urdu (IAU). Each origin has its own disjoint data set, its own language model and its own training. Syllabification is followed by the construction of two separate language models, one for IAH and one for IAU. The language model is followed by the transliteration model, in which the mapping alignments of syllable sequences between Hindi and English NEs are carried out. Syllabification and alignment of NEs is the process of partitioning the NEs of the source and target languages into transliteration units while keeping their phonetic mappings intact.
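
To make the notion of transliteration units concrete, the following Python sketch splits a Devanagari name and its Roman counterpart into units and pairs them position by position. It is a minimal illustration, not the chapter's implementation: the regular expressions, the function names (`syllabify_devanagari`, `syllabify_roman`, `align_units`) and the rule of pairing units only when both sides have the same count are all assumptions made here. An SMT system would typically learn such alignments from training data rather than rely on a fixed rule.

```python
import re

# Devanagari code point classes (Unicode block U+0900-U+097F).
CONSONANT = "\u0915-\u0939\u0958-\u095F"
INDEPENDENT_VOWEL = "\u0904-\u0914"
MATRA = "\u093E-\u094C"        # dependent vowel signs
MODIFIER = "\u0901-\u0903"     # chandrabindu, anusvara, visarga
NUKTA = "\u093C"
VIRAMA = "\u094D"

# Assumed unit shape: optional conjunct prefix (consonant + virama),
# then a consonant with an optional vowel sign or final virama,
# or an independent vowel; optionally followed by one modifier.
AKSHARA = re.compile(
    f"(?:[{CONSONANT}]{NUKTA}?{VIRAMA})*"
    f"(?:[{CONSONANT}]{NUKTA}?(?:[{MATRA}]|{VIRAMA})?"
    f"|[{INDEPENDENT_VOWEL}])"
    f"[{MODIFIER}]?"
)

# Assumed Roman unit shape: consonants followed by vowels,
# plus any consonants left over at the end of the word.
ROMAN_UNIT = re.compile(r"[^aeiou]*[aeiou]+|[^aeiou]+$")


def syllabify_devanagari(name):
    """Split a single Devanagari word into akshara-like units."""
    return AKSHARA.findall(name)


def syllabify_roman(name):
    """Split a single Roman-script word into consonant-vowel units."""
    return ROMAN_UNIT.findall(name.lower())


def align_units(source, target):
    """Pair units one-to-one; return None when the counts differ."""
    src = syllabify_devanagari(source)
    tgt = syllabify_roman(target)
    if len(src) != len(tgt):
        return None
    return list(zip(src, tgt))


if __name__ == "__main__":
    print(align_units("राम", "ram"))
    print(align_units("दिल्ली", "dilli"))
```

Pairs whose unit counts differ are returned as `None` in this sketch; a fuller pipeline would need a strategy for them, such as merging or splitting units.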