Improved Statistical Machine Translation by Cross-Linguistic Projection of Named Entities Recognition and Translation

Rahma Sellami, Fatima Deffaf, Fatiha Sadat, Lamia Hadrich Belguith


One of the persistent difficulties in natural language processing applications is the lack of appropriate tools for the recognition, translation, and/or transliteration of named entities (NEs), particularly for less-resourced languages. In this paper, we propose a new method to automatically label multilingual parallel data for the Arabic-French language pair with named entity tags and to build lexicons of those named entities with their transliteration and/or translation in the target language. For this purpose, we bring in a third, well-resourced language, English, that can serve as a pivot for building an Arabic-French NE translation lexicon. Evaluations on the Arabic-French language pair, using English as the pivot in the transitive model, showed the effectiveness of the proposed method for mining Arabic-French named entities and their translations. Moreover, integrating this component into a statistical machine translation system outperformed the baseline system.
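
The transitive idea of going through English can be pictured as a join of two lexicons on their shared English entries. The following is only a minimal sketch under stated assumptions: each lexicon is a plain dictionary mapping a named entity to a set of candidate translations, the function name compose_via_pivot is hypothetical, and the toy entries are hand-written for illustration rather than taken from the paper. The actual model may additionally score, filter, or transliterate candidates.

```python
from collections import defaultdict


def compose_via_pivot(src_to_pivot, pivot_to_tgt):
    """Join two NE lexicons on their shared pivot-language entries.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    src_to_tgt = defaultdict(set)
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            # A source entry is kept only if its pivot form has a target entry.
            for tgt in pivot_to_tgt.get(pivot, ()):
                src_to_tgt[src].add(tgt)
    return dict(src_to_tgt)


# Toy, hand-written entries (assumed for illustration, not data from the paper).
ar_en = {"باريس": {"Paris"}, "لندن": {"London"}, "القاهرة": {"Cairo"}}
en_fr = {"Paris": {"Paris"}, "London": {"Londres"}}

ar_fr = compose_via_pivot(ar_en, en_fr)
```

In this toy run, the Arabic entry for Cairo is dropped because its English form has no French counterpart in the second lexicon, which illustrates how coverage of the pivot lexicons bounds the coverage of the composed one.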


Named entity, pivot language, machine translation.
