Phrase extraction for machine translation

Rosner, Michael; Bajada, Jo-Ann

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/22780

Title:	Phrase extraction for machine translation
Authors:	Rosner, Michael; Bajada, Jo-Ann
Keywords:	Translators (Computer programs); Bilingualism; Corpora (Linguistics); Machine translating
Issue Date:	2007
Publisher:	University of Malta. Faculty of ICT
Citation:	Bajada, J., & Rosner, M. (2007). Phrase extraction for machine translation. 5th Computer Science Annual Workshop (CSAW’07), Msida. 226-233.
Abstract:	Statistical Machine Translation (SMT) emerged in the late 1980s, initially based on a word-to-word translation process. Such processes struggle when a good-quality translation is not strictly word-to-word. Easy cases can be handled by allowing insertion and deletion of single words, but word reordering phenomena in general require a more general translation process. There is currently much interest in phrase-to-phrase models, which can overcome this problem but require that candidate phrases, together with their translations, be identified in the training corpora. Since phrase delimiters are not explicit, this gives rise to a new problem: that of phrase pair extraction. The current project proposes a phrase extraction algorithm which uses a window of n words around source and target words to extract equivalent phrases. The extracted phrases, together with their probabilities, are used as input to an existing Machine Translation system for the purpose of evaluating the phrase extraction algorithm.
URI:	https://www.um.edu.mt/library/oar//handle/123456789/22780
Appears in Collections:	Scholarly Works - FacICTCS
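
Illustrative sketch (not taken from the paper): the abstract does not say how the windows are placed or how the phrase probabilities are estimated, so the following Python shows only one possible reading of the window-of-n-words idea. It assumes that word alignments are already available as (source index, target index) pairs, that a window is the n words on either side of an aligned word, and that probabilities are plain relative frequencies. The function names and the toy sentence pair are hypothetical.

```python
from collections import Counter


def window(tokens, center, n):
    """Return the tokens within n positions of `center`, clamped to the sentence."""
    start = max(0, center - n)
    end = min(len(tokens), center + n + 1)
    return tuple(tokens[start:end])


def extract_phrase_pairs(source, target, alignment, n):
    """Yield one (source_phrase, target_phrase) pair per aligned word pair.

    `alignment` is a set of (source_index, target_index) pairs; it is assumed
    to be given, since the abstract does not describe how words are aligned.
    """
    for i, j in sorted(alignment):
        yield window(source, i, n), window(target, j, n)


def phrase_probabilities(corpus, n):
    """Estimate p(target_phrase | source_phrase) by relative frequency.

    Relative frequency is an assumption made for this sketch, not a detail
    stated in the abstract.
    """
    pair_counts = Counter()
    source_counts = Counter()
    for source, target, alignment in corpus:
        for src, tgt in extract_phrase_pairs(source, target, alignment, n):
            pair_counts[(src, tgt)] += 1
            source_counts[src] += 1
    return {pair: count / source_counts[pair[0]]
            for pair, count in pair_counts.items()}


# Toy, hand-aligned sentence pair (hypothetical data).
toy_corpus = [
    (["the", "red", "house"], ["la", "casa", "rossa"], {(0, 0), (1, 2), (2, 1)}),
]
toy_table = phrase_probabilities(toy_corpus, n=1)
```

With n = 0 this reduces to a word-to-word table. Larger windows capture local reordering, such as the adjective and noun swapping places in the toy pair, at the cost of sparser counts.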

Files in This Item:

File	Description	Size	Format
Proceedings of CSAW’07 - A22.pdf		200.91 kB	Adobe PDF
