Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/135070

| Title: | Employing sentiment analysis to classify text |
| Authors: | Borg, Gabriel (2024) |
| Keywords: | Machine learning; Natural language processing (Computer science); Artificial intelligence; Neural networks (Computer science) |
| Issue Date: | 2024 |
| Citation: | Borg, G. (2024). Employing sentiment analysis to classify text (Master’s dissertation). |
| Abstract: | In this research, we took an investigative approach to explore the effectiveness of various Machine Learning (ML) classification models and Natural Language Processing techniques in the domains of text classification and sentiment analysis. We began with a comprehensive literature review, examining prior studies to understand the prevalent methodologies and the outcomes achieved in this field. This review guided our selection of eight prominent classification algorithms. The first four were classical classification algorithms: the Decision Tree Classifier, Support Vector Machine, Logistic Regression, and Naïve Bayes Classifier. The remaining four were neural network-based models: the Long Short-Term Memory model, the Artificial Neural Network model, the Recurrent Neural Network model, and the Multi-Layer Perceptron model. Our primary objective was to identify which of these algorithms is most suitable for accurately categorizing text into three distinct sentiment classes (positive, neutral, and negative) based solely on its textual content. We conducted this study on two distinct text datasets, each with unique linguistic characteristics. Before applying the ML models, we carried out a rigorous preparation and pre-processing phase to ensure the datasets were ready for training. This process involved cleaning, normalization, tokenization, stemming, and lemmatization, so that the raw text could be analyzed effectively by the ML algorithms without changing the overall text content of the dataset. These pre-processing steps aimed to extract meaningful features from the text corpus, eliminate irrelevant noise, and ensure consistency across the data. We then trained each classification algorithm on both datasets. This training phase enabled us to evaluate each algorithm in terms of execution time and Central Processing Unit (CPU) and Random Access Memory (RAM) demands. To assess the performance of each algorithm objectively, we employed performance metrics that have been used in the literature: accuracy, recall, precision, and F1-score. These metrics provided insights into the ability of each algorithm to classify texts correctly into their respective sentiment categories. The insights gained from this comprehensive study provide invaluable information for selecting appropriate techniques and optimizing their performance in text classification and sentiment analysis problems. The comparative analysis of all the algorithms presented and implemented offers a valuable framework for researchers seeking to categorize text effectively based on sentiment. |
| Description: | M.Sc.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/135070 |
| Appears in Collections: | Dissertations - FacICT - 2024; Dissertations - FacICTAI - 2024 |
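
The abstract names accuracy, precision, recall, and F1-score as the evaluation metrics for the three sentiment classes. The full text is under restricted access, so the sketch below is only an illustration of how such metrics are commonly computed, not the author's implementation. The function name `sentiment_metrics`, the label strings, and the use of macro-averaging for the overall F1-score are all assumptions made for this example.

```python
from collections import Counter

# Assumed label names for the three sentiment classes (hypothetical strings).
LABELS = ("positive", "neutral", "negative")


def sentiment_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy, per-class precision/recall/F1, and macro-averaged F1.

    Illustrative helper: the macro average is an assumption, since the
    abstract does not say how per-class scores were combined.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Count every (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))

    per_class = {}
    for label in labels:
        tp = pairs[(label, label)]
        # False positives: predicted as `label`, but the true class differs.
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        # False negatives: truly `label`, but predicted as something else.
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)

        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        per_class[label] = {"precision": precision, "recall": recall, "f1": f1}

    # Accuracy: fraction of samples whose prediction matches the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)

    # Macro F1: unweighted mean of the per-class F1 scores.
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return accuracy, per_class, macro_f1


# Toy data invented for illustration; not taken from the dissertation.
y_true = ["positive", "positive", "neutral", "negative", "negative", "neutral"]
y_pred = ["positive", "neutral", "neutral", "negative", "positive", "neutral"]
accuracy, per_class, macro_f1 = sentiment_metrics(y_true, y_pred)
print(accuracy, macro_f1)
```

Reporting the per-class scores alongside accuracy helps when the classes are imbalanced, since a classifier can reach high accuracy while rarely predicting a minority class such as neutral.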
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2519ICTICS520005071976_1.PDF | Restricted Access | 5.96 MB | Adobe PDF | View/Open Request a copy |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
