Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/63040
Title: Teach me how to feel : learning to classify emotions from short texts
Authors: Balzan, Janice
Keywords: Natural language processing (Computer science)
Language and emotions
Microblogs
Issue Date: 2020
Citation: Balzan, J. (2020). Teach me how to feel: learning to classify emotions from short texts (Bachelor's dissertation).
Abstract: Emotion classification is a very useful process in natural language processing. It provides readers with more pragmatic information about the text that they are reading, but due to the lack of multimodal information, this can be quite hard to do computationally. The SemEval shared task of 2018, called Affect in Tweets, provided users with tweets with labels indicating which emotions they expressed. Eleven different emotions were used as classes, namely Anger, Anticipation, Disgust, Fear, Joy, Love, Optimism, Pessimism, Sadness, Surprise and Trust. In this dissertation, four different machine learning algorithms were tested to find the best approach for emotion classification in tweets. These were Naive Bayes, Logistic Regression, Support Vector Machines and a Recurrent Neural Network. Moreover, for the Naive Bayes and deep learning models, two different types of classes were used. Firstly, all eleven emotions were concatenated together and a binary string was created, representing the combination of all the emotions. Secondly, they were classified separately as individual classes and the model was trained on each emotion on its own. For the other two machine learning algorithms, only the latter strategy was used. Additionally, seven types of features and combinations were extracted from the tweets. These were: the tweets with a pre-processing procedure applied to them; the subjectivity of the tweets; the polarity of the tweets; the tweets and their subjectivity; the tweets and their polarity; the values of subjectivity and polarity of the tweets; the tweets and their subjectivity and polarity concatenated together. Upon evaluation, the Logistic Regression model with the combination of tweets, subjectivity and polarity used as features was the one that performed best. Furthermore, the correlation between emotions was analysed and it was found that Anger and Disgust were the two most correlated emotions.
The results obtained were carefully analysed, and conclusions were drawn about the feasibility of classifying emotions from short social media texts.
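The two class representations described in the abstract can be sketched as follows. This is only an illustrative sketch, not the dissertation's code: the function names, the ordering of the emotions within the binary string, and the example tweet labels are all assumptions made here.

```python
# Illustrative sketch only: names and emotion ordering are assumptions,
# not taken from the dissertation.
EMOTIONS = [
    "anger", "anticipation", "disgust", "fear", "joy", "love",
    "optimism", "pessimism", "sadness", "surprise", "trust",
]


def to_binary_string(present):
    """First strategy: all eleven emotions concatenated into one
    binary string that acts as a single combined class."""
    present = set(present)
    return "".join("1" if emotion in present else "0" for emotion in EMOTIONS)


def to_per_emotion_labels(present):
    """Second strategy: each emotion is its own yes/no label, so one
    model can be trained per emotion."""
    present = set(present)
    return {emotion: int(emotion in present) for emotion in EMOTIONS}


# Hypothetical tweet annotated with two emotions.
example = ["anger", "disgust"]
combined = to_binary_string(example)
separate = to_per_emotion_labels(example)

print(combined)
print(separate["anger"], separate["joy"])
```

Under the first representation, every distinct combination of emotions becomes its own class, so the number of possible classes grows quickly. Under the second, each emotion is a separate binary problem.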
Description: B.SC.(HONS)HUMAN LANGUAGE TECH.
URI: https://www.um.edu.mt/library/oar/handle/123456789/63040
Appears in Collections: Dissertations - InsLin - 2020

Files in This Item:
File: 20BSCHLT002.pdf (Restricted Access)
Size: 3.09 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.