Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/108382
Title: Error-checking for Maltese : deep learning for a low-resource scenario
Authors: Debattista, Aaron (2022)
Keywords: Maltese language -- Grammar -- Software
Deep learning (Machine learning)
Issue Date: 2022
Citation: Debattista, A. (2022). Error-checking for Maltese: deep learning for a low-resource scenario (Master's dissertation).
Abstract: In this study, we examine resource-scarce Grammar Error Correction (GEC). Already a vast subject, GEC gains new layers of complexity when applied to low-resource environments. We explore the state of the art and how it stemmed from rudimentary rule-based systems, transitioned into statistical models, and eventually evolved into neural techniques. The deep learning solutions applied in GEC are encapsulated by Neural Machine Translation. We experimented with Encoder-Decoder models, focusing on the Sequence-to-Sequence (Seq2Seq) and transformer architectural paradigms. We implemented a solution based on these paradigms and established three baseline models: a Seq2Seq model, a Vaswani-style transformer (Vaswani et al., 2017) and a Nematus-style transformer (Sennrich et al., 2017). We then applied several literature-backed adaptations to improve the solution’s performance in a low-resource environment. During experimentation, we observed progressive improvement upon applying tied embeddings and pretrained models for transfer learning. We evaluated the model on Maltese to observe the solution’s performance when applied to a real-world low-resource language. The final model achieved an F0.5 score of 31.84%. This score is similar to those of other GEC solutions, particularly those submitted for the BEA-2019 shared task. However, the implications of this result need to be analysed critically. The other systems submitted to BEA-2019 were trained on different datasets and under different conditions. Therefore, the score of 31.84% is not a conclusive indication that our solution is better or worse than the alternatives submitted during the shared task. Nevertheless, we believe that the findings of this study are important because Maltese still lags in terms of Grammar Error Correction, and new, innovative ways of bypassing the data shortages are required to push Maltese GEC to the next level.
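Note on the metric: the F0.5 score quoted in the abstract is conventionally the F-beta measure with beta = 0.5, which weights precision more heavily than recall. The sketch below shows that standard formula only. This record does not state the dissertation's scoring tool or its precision and recall values, so the function name `f_beta` and the example numbers are illustrative assumptions, not results from the study.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favours precision over recall.

    Hypothetical helper for illustration, not the dissertation's scorer.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # No correct edits proposed and none found: define the score as 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Illustrative values only (assumed, not taken from the study).
example_f05 = f_beta(precision=0.6, recall=0.2)
example_f1 = f_beta(precision=0.6, recall=0.2, beta=1.0)
```

With precision higher than recall, as in the assumed values above, F0.5 comes out higher than F1, which is why the measure suits correction systems where a wrong edit is considered costlier than a missed one.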
Description: M.Sc.(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/108382
Appears in Collections: Dissertations - FacICT - 2022
Dissertations - FacICTAI - 2022

Files in This Item:
File: 2219ICTICS520005031572_1.PDF
Size: 2.18 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.