Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/120597
Title: Deciphering the sentiment of Maltese citizens during the COVID‐19 pandemic through social media using BERT and its variants
Authors: Gauci, Annabel (2023)
Keywords: COVID-19 Pandemic, 2020-2023 -- Malta; Data sets -- Malta; Sentiment analysis -- Malta
Issue Date: 2023
Citation: Gauci, A. (2023). Deciphering the sentiment of Maltese citizens during the COVID‐19 pandemic through social media using BERT and its variants (Master's dissertation).
Abstract: Information, both factual and non‐factual, is uploaded and shared at a rapid pace on the World Wide Web. It takes time for a human being to read, evaluate, process and react to information, and it can be difficult for the producer of that information to understand the audience’s reaction to it. This is true for companies that need to figure out how their target market reacts to their brand, for politicians who want to understand how they are perceived by potential voters and, more closely related to this study, for governments that need to understand the sentiment of their people during a worldwide pandemic. In this work, we go through several Facebook comments and Twitter posts to decipher the sentiment of Maltese citizens at the start of the COVID‐19 pandemic, during the years 2019 and 2020. During these two years, Malta and the rest of the world were starting to understand the COVID‐19 virus and to become knowledgeable about its symptoms, effects, and overall impact on human health. By understanding the sentiment of citizens, large entities such as the local government and large corporations can start to understand what the population needs and how to aid it effectively in such a troubling time. Performing sentiment analysis on relevant content scraped from citizens’ posts, instead of asking predetermined questions, removes a particular element of bias and allows for deeper insight into the people’s freely voiced concerns or opinions. In our research, we explore five models to analyse sentiment: BERTweet, an adaptation of Google’s BERT pre‐trained on COVID‐19 Tweets; mBERT (multilingual BERT), another adaptation of Google’s BERT pre‐trained on several languages; BERTu, also an adaptation of BERT, pre‐trained on a corpus in Maltese; its multilingual sibling, mBERTu; and a self‐built LSTM model. Each model offers a distinct advantage for this study, and the results are compared and contrasted across all five.
All models are first fed the SST2 dataset, which is used as a benchmark because it is already pre‐labeled and in English. English is a language on which all our BERT variants are initially trained, which provides a level playing field for all. On the Facebook comment dataset, which includes comments in both Maltese and English, the performance of mBERT, mBERTu, BERTu and BERTweet is practically on par. This might be because around half of this dataset is in English, as the Maltese population likes to mix both languages colloquially. On the automatically translated dataset, which is taken from Kaggle, discusses COVID‐19 and is auto‐translated into Maltese only, performance is not as good as on the Facebook comment dataset, but these models again perform similarly to one another. The LSTM, being a self‐built language model, does not undergo as much training and pre‐training as the rest of our large state‐of‐the‐art models, but performs respectably well on all our datasets considering these limitations.
Description: M.Sc.(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/120597
Appears in Collections: Dissertations - FacICT - 2023; Dissertations - FacICTAI - 2023

Files in This Item:
File: 2319ICTICS520005032594_1.PDF
Description: Restricted Access
Size: 1.88 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.