Title: Extracting semantically similar words : building a distributional semantic model for Maltese
Authors: Camilleri, Christabelle
Keywords: Maltese language -- Semantics
Maltese language -- Morphology
Maltese language -- Synonyms and antonyms
Issue Date: 2018
Citation: Camilleri, C. (2018). Extracting semantically similar words : building a distributional semantic model for Maltese (Bachelor's dissertation).
Abstract: This dissertation describes the process of building a Distributional Semantic Model (DSM) for Maltese, with the aim of automatically generating semantically similar words for a large list of target words. The input to the system is a sub-corpus of the MLRS corpus consisting of news articles. A program extracts neighbouring words and their frequencies from this corpus, keeping the top 110,000 context words and the top 15,000 nouns among those context words. The model uses a bag-of-words approach and selects contexts from a window of one word preceding and one word following the target word. To evaluate the quality of the DSM, a well-known evaluation data set for English, the WordSim-353 dataset, was translated from English to Maltese, and similarity scores were given by 8 participants. The DSM is evaluated using the Spearman and Pearson correlation coefficients to measure how closely its semantic similarity scores for word pairs agree with the human judgements. Results show that the performance of the DSM lies within the range of performances obtained with DSMs for the English language using different setups. Its performance is, admittedly, closer to the lower end of that range. The rich morphology of the Maltese language, the nature of the evaluation set, and the fact that we only experimented with a limited number of settings are factors that probably played a major role in this outcome.
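The pipeline summarised in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. It assumes a corpus that is already tokenised, uses cosine similarity as the comparison measure (the abstract does not name one), and runs on made-up English toy sentences and made-up scores. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_counts(sentences, window=1):
    """Count context words within `window` positions of each target word."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[target][tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either list has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy, made-up sentences standing in for the tokenised news sub-corpus.
sentences = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["the", "car", "stops"],
]
counts = cooccurrence_counts(sentences, window=1)
sim_cat_dog = cosine(counts["cat"], counts["dog"])
sim_cat_car = cosine(counts["cat"], counts["car"])

# Made-up human ratings and model scores for three hypothetical word pairs.
human_scores = [9.0, 2.0, 5.0]
model_scores = [0.8, 0.1, 0.3]
rho = spearman(human_scores, model_scores)
r = pearson(human_scores, model_scores)
```

In this toy data "cat" and "dog" share both of their contexts, so they come out as more similar than "cat" and "car", which share only one. A real setup would also restrict the vocabulary to the most frequent context words and target nouns, as the abstract describes.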
Appears in Collections: Dissertations - InsLin - 2018

Files in This Item:
File: Adobe PDF, 1.84 MB (restricted access)

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.