Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/120592
Title: Short‐term stock market price prediction based on social media and news sentiment
Authors: Xuereb, Owen (2023)
Keywords: Stock price forecasting
Natural language processing (Computer science)
Social media
Journalism, Commercial
Sentiment analysis
Issue Date: 2023
Citation: Xuereb, O. (2023). Short‐term stock market price prediction based on social media and news sentiment (Master's dissertation).
Abstract: Stock market price prediction is one of the most daunting tasks in finance. The growth of social media and news availability over the internet has added a rapidly available external source of information related to different companies and their stocks, allowing investors to make sentiment‐based, informed decisions. Modern literature has shown significant advancements in Natural Language Processing (NLP) models with the introduction of BERT and its variants, although the application of these models is limited in financial research and, where they were used, the sentiment output was not applied to forecast stock prices. In this study, we proposed a four‐stage pipeline with six objectives to answer our main research question: “How can sentiment analysis from news and social media contribute as a feature to short‐term financial price trajectory predictions?” In the first stage, we identified available financial datasets and created our own to further pre‐train multiple large language models based on BERT, RoBERTa, and ALBERT using Masked Language Modelling (MLM), before fine‐tuning them for financial Sentiment Analysis (SA). In stage three, we fine‐tuned these models for financial SA and validated them on the Financial PhraseBank dataset used by the previous state‐of‐the‐art financial large language model, FinBERT. We applied several pre‐training and fine‐tuning strategies to improve performance, reduce computing resources, and decrease training time. We also evaluated several layer‐based strategies to reduce catastrophic forgetting, such as Gradual Unfreeze (GU), Slanted Triangular Learning Rate (STL), and Discriminative Fine‐Tuning (DFT). Using our best SA model, we generated several date‐based sentiment classification datasets containing tweets and news articles for numerous stocks from Twitter and CNBC News.
These datasets were then cross‐applied to their respective stock ticker’s daily price movement, where we trained multiple Neural Network (NN) and Temporal Fusion Transformer (TFT) models to predict day‐to‐day closing prices using varying features and architectures. We noted that incorporating sentiment values into day‐to‐day price trajectory classification and time series forecasting outperformed the use of historical price metrics alone. When we combined sentiment with the previous day’s price, our neural networks produced results for some stocks that exceeded 74% accuracy in predicting the current day’s closing price.
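The three layer‐based strategies named in the abstract (Gradual Unfreeze, Slanted Triangular Learning Rate, and Discriminative Fine‐Tuning) can be illustrated with a minimal sketch. It follows the commonly cited ULMFiT formulation and is not taken from the dissertation: the function names, the default values (`lr_max=0.01`, `cut_frac=0.1`, `ratio=32`, `decay=2.6`), and the one‐layer‐per‐epoch unfreezing schedule are all assumptions made for illustration, and the dissertation's actual settings may differ.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t: short linear warm-up, then long linear decay.

    Defaults are illustrative assumptions, not the dissertation's settings.
    """
    # Step at which the rate peaks; max(1, ...) avoids division by zero
    # for very short runs (a guard added for this sketch).
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut  # rising part of the triangle
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # falling part
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, num_layers, decay=2.6):
    """Per-layer learning rates: index 0 is the lowest layer, -1 the top.

    Each layer below the top gets the rate of the layer above it
    divided by `decay`, so lower layers change more slowly.
    """
    return [top_lr / decay ** (num_layers - 1 - i) for i in range(num_layers)]


def trainable_layers(epoch, num_layers):
    """Gradual unfreezing: epoch 0 trains only the top layer, and each
    later epoch unfreezes one more layer below it (assumed schedule)."""
    first = max(0, num_layers - 1 - epoch)
    return list(range(first, num_layers))


if __name__ == "__main__":
    steps = 1000
    for t in (0, 50, 100, 500, 1000):
        print(t, slanted_triangular_lr(t, steps))
    print(discriminative_lrs(2e-5, num_layers=4))
    print(trainable_layers(epoch=2, num_layers=12))
```

In a real fine‐tuning run, values like these would typically be fed into an optimizer's per‐parameter‐group learning rates and a per‐step scheduler; the sketch only computes the numbers.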
Description: M.Sc.(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/120592
Appears in Collections: Dissertations - FacICT - 2023
Dissertations - FacICTAI - 2023

Files in This Item:
File: 2419ICTICS520005075734_1.PDF
Size: 10.35 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.