Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/130108
Title: BERT for sentiment analysis of Japanese Twitter
Authors: Klein, Jordan W.
Keywords: Sentiment analysis
Social media
Japanese language -- Discourse analysis
Issue Date: 2024
Citation: Klein, J. W. (2024). BERT for sentiment analysis of Japanese Twitter (Master's dissertation).
Abstract: This publication introduces novel, open-source resources for sentiment analysis on Japanese Twitter. BERT for Japanese Twitter is a pre-trained model that is highly competent in the target domain and adaptable to a variety of tasks. Japanese Twitter Sentiment 1k (JTS1k) is a compact sentiment analysis dataset optimized for balance and reliability. This combination of pre-trained model and dataset was used to fine-tune a sentiment analysis model that applies broadly to Japanese social networking services (SNS): BERT for Japanese SNS Sentiment. The primary focus of this project is domain adaptation. Using an established Japanese BERT model as a foundation, domain adaptation was achieved by optimizing the vocabulary and continuing pre-training on a large Twitter corpus. A similar methodology was used to develop Twitter Multilingual RoBERTa (XLM-T) (Barbieri et al., 2022), which is the state-of-the-art multilingual Twitter model. By using a monolingual approach, this study developed a more efficient model that outperformed XLM-T in the target language. This project explored fundamental elements of corpus construction, corpus refinement, dataset annotation, preprocessing, pre-training, fine-tuning, and benchmarking. It concludes with a demonstration that the sentiment model is valid, useful, and sensitive to changes in public sentiment that correlate with real-world events.
Description: M.A.(Melit.)
Appears in Collections: Dissertations - InsLin - 2024

Files in This Item:
File: 2419LLTLIN500105075820_1.PDF
Size: 16.58 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.