Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/22780
Title: | Phrase extraction for machine translation |
Authors: | Rosner, Michael; Bajada, Jo-Ann |
Keywords: | Translators (Computer programs); Bilingualism; Corpora (Linguistics); Machine translating |
Issue Date: | 2007 |
Publisher: | University of Malta. Faculty of ICT |
Citation: | Bajada, J., & Rosner, M. (2007). Phrase extraction for machine translation. 5th Computer Science Annual Workshop (CSAW’07), Msida. 226-233. |
Abstract: | Statistical Machine Translation (SMT) developed in the late 1980s, based initially upon a word-to-word translation process. However, such processes have difficulty when a good-quality translation is not strictly word-to-word. Easy cases can be handled by allowing the insertion and deletion of single words, but more general word-reordering phenomena require a more general translation process. There is currently much interest in phrase-to-phrase models, which can overcome this problem but require that candidate phrases, together with their translations, be identified in the training corpora. Since phrase delimiters are not explicit, this gives rise to a new problem: that of phrase pair extraction. The current project proposes a phrase extraction algorithm that uses a window of n words around source and target words to extract equivalent phrases. The extracted phrases, together with their probabilities, are used as input to an existing Machine Translation system for the purpose of evaluating the phrase extraction algorithm. |
URI: | https://www.um.edu.mt/library/oar//handle/123456789/22780 |
Appears in Collections: | Scholarly Works - FacICTCS |
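The abstract describes extracting phrase pairs from a window of n words around source and target words, and attaching probabilities to them. The record does not give the algorithm's details, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes that word alignments are already available as (source index, target index) pairs and that probabilities are estimated by relative frequency; the function names `window`, `extract_phrase_pairs` and `phrase_table` are hypothetical.

```python
from collections import Counter


def window(tokens, center, n):
    """Return the tokens within n positions of index `center`, as a tuple."""
    start = max(0, center - n)
    return tuple(tokens[start:center + n + 1])


def extract_phrase_pairs(source, target, alignment, n=1):
    """Yield one candidate (source_phrase, target_phrase) per aligned word pair.

    Assumption: `alignment` is a list of (source_index, target_index) pairs
    produced by some external word aligner.
    """
    for i, j in alignment:
        yield window(source, i, n), window(target, j, n)


def phrase_table(corpus, n=1):
    """Estimate p(target_phrase | source_phrase) by relative frequency.

    Assumption: `corpus` is an iterable of (source_tokens, target_tokens,
    alignment) triples.
    """
    pair_counts = Counter()
    source_counts = Counter()
    for source, target, alignment in corpus:
        for src, tgt in extract_phrase_pairs(source, target, alignment, n):
            pair_counts[(src, tgt)] += 1
            source_counts[src] += 1
    return {
        pair: count / source_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


if __name__ == "__main__":
    corpus = [
        ("the red house".split(), "la casa roja".split(),
         [(0, 0), (1, 2), (2, 1)]),
    ]
    for (src, tgt), prob in phrase_table(corpus, n=1).items():
        print(" ".join(src), "|||", " ".join(tgt), "|||", prob)
```

A table of this shape (source phrase, target phrase, probability) is the kind of input a phrase-based translation system typically consumes, which is how the abstract says the extracted phrases are evaluated. The window size `n` trades coverage against noise: larger windows capture more reordering but also more unrelated words.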
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Proceedings of CSAW’07 - A22.pdf | | 200.91 kB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.