Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/107903
Title: Multilingual low-resource translation for Indo-European languages
Authors: Sant, Jake (2022)
Keywords: Indo-European languages -- Machine translating
Transfer learning (Machine learning)
Neural networks (Computer science)
Issue Date: 2022
Citation: Sant, J. (2022). Multilingual low-resource translation for Indo-European languages (Bachelor's dissertation).
Abstract: Neural machine translation is the task of translating text between a source language and a target language using artificial neural networks. Training neural machine translation models is typically computationally expensive and requires vast amounts of data to build a model that translates accurately enough for real-world use. Some of the best translation models in the world, such as those developed by Google and Facebook, are trained on billions of sentences where data is available. These models provide coverage for over 100 languages; however, many of these languages do not have large parallel corpora, and this becomes evident from the grammatical and linguistic errors seen in translation outputs. This project focuses on neural machine translation in the context of low-resource languages, for which far fewer online corpora are available than for widely spoken languages such as English, Spanish and Mandarin. A dataset was constructed by combining different multilingual sources into a single larger multilingual corpus covering English, Danish, German, Icelandic, Norwegian and Swedish. A combination of pre-trained and novel neural networks was trained on this new dataset, primarily using the technique of transfer learning.
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/107903
Appears in Collections: Dissertations - FacICT - 2022
Dissertations - FacICTAI - 2022

Files in This Item:
File: 2208ICTICT390905064566_1.PDF (Restricted Access)
Size: 1.14 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.