Employing sentiment analysis to classify text

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/135070

Title:	Employing sentiment analysis to classify text
Authors:	Borg, Gabriel (2024)
Keywords:	Machine learning; Natural language processing (Computer science); Artificial Intelligence; Neural networks (Computer science)
Issue Date:	2024
Citation:	Borg, G. (2024). Employing sentiment analysis to classify text (Master’s dissertation).
Abstract:	This research takes an investigative approach to exploring the effectiveness of various Machine Learning (ML) classification models and Natural Language Processing techniques for text classification and sentiment analysis. We began with a comprehensive literature review, examining prior studies to understand the prevalent methodologies and the outcomes achieved in this field. This review guided our selection of eight prominent classification algorithms. The first four are classical classification algorithms: the Decision Tree Classifier, Support Vector Machine, Logistic Regression, and Naïve Bayes Classifier. The remaining four are neural network-based models: the Long Short-Term Memory model, the Artificial Neural Network model, the Recurrent Neural Network model, and the Multi-Layer Perceptron model. Our primary objective was to identify which of these algorithms is most suitable for accurately categorizing text into three distinct semantic classes (positive sentiment, neutral sentiment, and negative sentiment) based solely on the textual content. We conducted this study on two distinct text datasets, each with its own linguistic characteristics. Before applying the ML models, we carried out a rigorous preparation and pre-processing phase to ensure the datasets were ready for training. This phase involved cleaning, normalizing, tokenizing, stemming, and lemmatizing the raw text so that the ML algorithms could analyze it effectively without changing the overall content of the dataset. These pre-processing steps aimed to extract meaningful features from the text corpus, eliminate irrelevant noise, and ensure consistency across the data. We then trained each classification algorithm on both datasets.
This training phase enabled us to evaluate the resource demands of each algorithm, including execution time, Central Processing Unit (CPU) usage, and Random Access Memory (RAM) usage. To assess the classification performance of each algorithm objectively, we employed metrics established in the literature: accuracy, precision, recall, and F1-score. These metrics provided insights into the ability of each algorithm to correctly classify text into its respective semantic category. The insights gained from this study inform the selection of appropriate techniques and the optimization of their performance in text classification and semantic analysis problems. The comparative analysis of all the algorithms implemented offers a valuable framework for researchers seeking to categorize text effectively based on sentiment.
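The four metrics named in the abstract can be illustrated with a minimal sketch for a three-class sentiment task. The label names, the function names, and the choice of macro averaging are assumptions made for illustration; this record does not state how the dissertation computed or averaged its scores.

```python
from collections import Counter

# Hypothetical label names for the three sentiment classes.
LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if y_true else 0.0


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores (an assumed averaging)."""
    scores = per_class_scores(y_true, y_pred, labels)
    return sum(f1 for _, _, f1 in scores.values()) / len(labels)


if __name__ == "__main__":
    # Toy labels invented for this example, not data from the dissertation.
    y_true = ["positive", "neutral", "negative", "positive", "neutral", "negative"]
    y_pred = ["positive", "neutral", "neutral", "positive", "negative", "negative"]
    print("accuracy:", accuracy(y_true, y_pred))
    print("per class:", per_class_scores(y_true, y_pred))
    print("macro F1:", macro_f1(y_true, y_pred))
```

Reporting per-class scores alongside accuracy is useful when the three classes are imbalanced, since accuracy alone can hide weak performance on a minority class.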
Description:	M.Sc.(Melit.)
URI:	https://www.um.edu.mt/library/oar/handle/123456789/135070
Appears in Collections:	Dissertations - FacICT - 2024; Dissertations - FacICTAI - 2024

Files in This Item:

File	Description	Size	Format
2519ICTICS520005071976_1.PDF Restricted Access		5.96 MB	Adobe PDF
