Title: Phrase extraction for machine translation
Authors: Rosner, Michael
Bajada, Jo-Ann
Keywords: Translators (Computer programs)
Corpora (Linguistics)
Machine translating
Issue Date: 2007
Publisher: University of Malta. Faculty of ICT
Citation: Bajada, J., & Rosner, M. (2007). Phrase extraction for machine translation. 5th Computer Science Annual Workshop (CSAW’07), Msida. 226-233.
Abstract: Statistical Machine Translation (SMT) developed in the late 1980s, based initially upon a word-to-word translation process. However, such processes have difficulties when good-quality translation is not strictly word-to-word. Easy cases can be handled by allowing insertion and deletion of single words, but more general word reordering phenomena require a more general translation process. There is currently much interest in phrase-to-phrase models, which can overcome this problem but require that candidate phrases, together with their translations, be identified in the training corpora. Since phrase delimiters are not explicit, this gives rise to a new problem: that of phrase pair extraction. The current project proposes a phrase extraction algorithm which uses a window of n words around source and target words to extract equivalent phrases. The extracted phrases, together with their probabilities, are used as input to an existing Machine Translation system for the purpose of evaluating the phrase extraction algorithm.
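Illustrative note: the abstract does not give the details of the algorithm, so the following Python sketch is only one possible reading of "a window of n words around source and target words", not the authors' method. It assumes that word alignment links are already available for each sentence pair and that phrase probabilities are estimated by relative frequency; the function names `window` and `extract_phrase_pairs` and the input layout are hypothetical.

```python
from collections import Counter


def window(tokens, center, n):
    """Return the tokens within n positions of index `center`, joined by spaces."""
    lo = max(0, center - n)
    hi = min(len(tokens), center + n + 1)
    return " ".join(tokens[lo:hi])


def extract_phrase_pairs(corpus, n=1):
    """Collect candidate phrase pairs and relative-frequency estimates of p(target | source).

    `corpus` is an iterable of (source_tokens, target_tokens, links), where
    `links` is a list of (source_index, target_index) word alignments.
    This input layout is an assumption made for the sketch.
    """
    pair_counts = Counter()
    source_counts = Counter()
    for src, tgt, links in corpus:
        for i, j in links:
            s = window(src, i, n)  # n words either side of the source word
            t = window(tgt, j, n)  # n words either side of the aligned target word
            pair_counts[(s, t)] += 1
            source_counts[s] += 1
    # Relative frequency: count(source, target) / count(source)
    return {pair: c / source_counts[pair[0]] for pair, c in pair_counts.items()}


if __name__ == "__main__":
    corpus = [
        (["the", "red", "house"], ["la", "maison", "rouge"], [(0, 0), (1, 2), (2, 1)]),
    ]
    for (s, t), p in sorted(extract_phrase_pairs(corpus, n=1).items()):
        print(f"{s} ||| {t} ||| {p:.2f}")
```

A table of this kind (source phrase, target phrase, probability) is the sort of input a phrase-based translation system typically consumes; the window size n controls how much context surrounds each aligned word pair.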
Appears in Collections: Scholarly Works - FacICTCS

Files in This Item:
File: Proceedings of CSAW’07 - A22.pdf
Size: 200.91 kB
Format: Adobe PDF

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.