Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/130108

| Title: | BERT for sentiment analysis of Japanese Twitter |
| Authors: | Klein, Jordan W. |
| Keywords: | Sentiment analysis; Social media; Japanese language -- Discourse analysis |
| Issue Date: | 2024 |
| Citation: | Klein, J. W. (2024). BERT for sentiment analysis of Japanese Twitter (Master's dissertation). |
| Abstract: | This publication introduces novel, open-source resources for sentiment analysis on Japanese Twitter. BERT for Japanese Twitter is a pre-trained model that is highly competent in the target domain and adaptable to a variety of tasks. Japanese Twitter Sentiment 1k (JTS1k) is a compact sentiment analysis dataset optimized for balance and reliability. This combination of pre-trained model and dataset was used to fine-tune a sentiment analysis model that broadly applies to Japanese social networking services (SNS): BERT for Japanese SNS Sentiment. The primary focus of this project is domain adaptation. Using an established Japanese BERT model as a foundation, domain adaptation was achieved by optimizing the vocabulary and continuing pre-training on a large Twitter corpus. A similar methodology was used to develop Twitter Multilingual RoBERTa (XLM-T) (Barbieri et al., 2022), the state-of-the-art multilingual Twitter model. By using a monolingual approach, this study developed a more efficient model that outperformed XLM-T in the target language. This project explored fundamental elements of corpus construction, corpus refinement, dataset annotation, preprocessing, pre-training, fine-tuning, and benchmarking. It concludes with a demonstration that the sentiment model is valid, useful, and sensitive to changes in public sentiment that correlate with real-world events. |
| Description: | M.A.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/130108 |
| Appears in Collections: | Dissertations - InsLin - 2024 |
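The abstract describes continuing BERT pre-training on a Twitter corpus, which relies on masked-language-model (MLM) corruption of the training text. As a minimal pure-Python sketch of the standard BERT-style masking scheme (80% `[MASK]`, 10% random token, 10% unchanged) — the function name and the simplified per-token treatment are illustrative, not the dissertation's actual code:

```python
import random

MASK_TOKEN = "[MASK]"

def mlm_mask(tokens, mask_prob=0.15, rng=None):
    """BERT-style masked-language-model corruption.

    Each token is selected for prediction with probability `mask_prob`.
    Of the selected tokens: 80% are replaced with [MASK], 10% with a
    random token, and 10% are left unchanged. Labels hold the original
    token at selected positions and None elsewhere (unselected
    positions are ignored by the pre-training loss).
    """
    rng = rng or random.Random()
    vocab = sorted(set(tokens))  # toy stand-in for the model vocabulary
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK_TOKEN)
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))
            else:
                inputs.append(tok)
        else:
            labels.append(None)
            inputs.append(tok)
    return inputs, labels
```

In practice this step is applied to subword tokens from the adapted vocabulary; the sketch operates on whole tokens only to keep the 80/10/10 logic visible.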
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2419LLTLIN500105075820_1.PDF | | 16.58 MB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
