Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/125231

| Title: | COMET for low-resource machine translation evaluation : a case study of English-Maltese and Spanish-Basque |
| Authors: | Falcão, Júlia; Borg, Claudia; Aranberri, Nora; Abela, Kurt |
| Keywords: | Natural language processing (Computer science); Computational linguistics; Translating and interpreting; Transliteration |
| Issue Date: | 2024-05 |
| Publisher: | ELRA and ICCL |
| Citation: | Falcão, J., Borg, C., Aranberri, N., & Abela, K. (2024). COMET for low-resource machine translation evaluation : a case study of English-Maltese and Spanish-Basque. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 3553–3565, Torino, Italia. ELRA and ICCL. |
| Abstract: | Trainable metrics for machine translation evaluation have been scoring the highest correlations with human judgements in the latest meta-evaluations, outperforming traditional lexical overlap metrics such as BLEU, which is still widely used despite its well-known shortcomings. In this work we look at COMET, a prominent neural evaluation system proposed in 2020, to analyze the extent of its language support restrictions, and to investigate strategies to extend this support to new, under-resourced languages. Our case study focuses on English-Maltese and Spanish-Basque. We run a crowd-based evaluation campaign to collect direct assessments and use the annotated dataset to evaluate COMET-22, further fine-tune it, and to train COMET models from scratch for the two language pairs. Our analysis suggests that COMET’s performance can be improved with fine-tuning, and that COMET can be highly susceptible to the distribution of scores in the training data, which especially impacts low-resource scenarios. |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/125231 |
| Appears in Collections: | Scholarly Works - FacICTAI |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| 2024.lrec-main.315_COMET.pdf | | 512.29 kB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
