Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/115270
Title: Contextualised word prediction system
Authors: Bugeja Douglas, Liam (2023)
Keywords: Natural language processing (Computer science)
Speech perception
Lexicology
Neural networks (Computer science)
Issue Date: 2023
Citation: Bugeja Douglas, L. (2023). Contextualised word prediction system (Bachelor's dissertation).
Abstract: In recent years, language modelling has become an important concept in natural language processing applications. An extensively researched area of natural language processing is word prediction: suggesting the most probable next word in a given text based on the preceding words. This technique is used in many text-related applications and saves users time whilst typing, leading to faster and easier communication between individuals. Whilst state-of-the-art language models have been rapidly improving in word prediction due to model optimisation and better training techniques, these models often struggle to predict the correct word when they are given limited text input. This study investigates the potential improvement in word prediction performance from enriching language models with contextual data obtained through image classification and speech recognition. For image classification, four different classification models were evaluated, including VGG-16, VGG-19, and Inception V3, to predict five indoor classes (bathroom, bedroom, dining room, kitchen, and living room) from a house room image dataset. For speech recognition, Google Cloud Speech-to-Text was employed to transcribe spoken words into text. Large language models, including RoBERTa, ELECTRA, and BERT, were then used to evaluate the effectiveness of the image classification and speech recognition by placing the predicted indoor room and the information obtained from speech transcription before the user input. To evaluate the models, a customised multimodal dataset was created with indoor rooms, recorded speech, and text input. To ensure the models were tested on new data, a separate language model was used to generate the text and speech input. The study revealed a noticeable enhancement in word prediction accuracy across all the language models when the additional context was used. Moreover, the system showed an improvement of 10% in word prediction accuracy, with the speech recognition data having the most substantial impact.
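The context-enrichment step described in the abstract (placing the predicted room and the speech transcript before the user's text) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function name `build_input`, the sentence template "I am in the ...", and the choice of a trailing mask token are all assumptions made for illustration only.

```python
from typing import Optional

# RoBERTa-style mask token (assumption); BERT and ELECTRA use "[MASK]" instead.
MASK = "<mask>"


def build_input(
    user_text: str,
    room: Optional[str] = None,
    transcript: Optional[str] = None,
    mask_token: str = MASK,
) -> str:
    """Prepend optional context to the user's text and append a mask token.

    `room` stands in for the label predicted by an image classifier and
    `transcript` for the text returned by a speech recogniser. The wording
    of the room sentence is a hypothetical template.
    """
    parts = []
    if room:
        parts.append(f"I am in the {room}.")
    if transcript:
        parts.append(transcript.strip())
    # The masked position is the next word the language model should predict.
    parts.append(f"{user_text.strip()} {mask_token}")
    return " ".join(parts)


# Without context, the model sees only the short user input.
plain = build_input("Please pass me the")

# With context, the room label and transcript come before the user input.
enriched = build_input(
    "Please pass me the",
    room="kitchen",
    transcript="We are about to cook dinner.",
)
```

The resulting string could then be passed to any masked language model that fills the mask position, and the top-ranked candidates compared with and without the added context.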
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/115270
Appears in Collections:Dissertations - FacICT - 2023
Dissertations - FacICTAI - 2023

Files in This Item:
File: Contextualised Word Prediction System.pdf
Size: 3.78 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.