Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/135376
| Title: | Towards an enhanced Arabic propaganda detection through argumentation features integration |
| Authors: | Nabhani, Sara (2024) |
| Keywords: | Propaganda; Computational linguistics; Natural language processing (Computer science); Arabic language -- Political aspects; Machine translating |
| Issue Date: | 2024 |
| Citation: | Nabhani, S. (2024). Towards an enhanced Arabic propaganda detection through argumentation features integration (Master’s dissertation). |
| Abstract: | Propaganda detection has become increasingly important in an era where misinformation can rapidly influence public opinion and social dynamics. This thesis investigates the potential of enhancing propaganda detection models by integrating argumentation features, a domain that remains underexplored, particularly in low‐resource languages such as Arabic. Given the scarcity of annotated argumentation datasets in Arabic, we explore two main approaches: leveraging machine translation (MT) to utilize high‐resource language data, and experimenting with zero‐shot models, including multilingual models and large language models like GPT. Our experiments were guided by three key research questions: (1) Does augmenting propaganda detection models with argumentation features improve performance? (2) How critical is data quality in the machine translation (MT) approach, and can human‐assisted annotation corrections further enhance model outcomes? (3) Can multilingual models and GPT serve as viable alternatives in situations where annotated data is scarce? The results indicate that while argumentation features do contribute to improved propaganda detection, the extent of this improvement is highly dependent on the quality of the argumentation annotations. Models using machine translation (MT) demonstrated that the stage at which translation is applied, and the quality of both the translation and annotation projection, are crucial factors affecting model performance. Furthermore, although multilingual models and GPT showed promise, particularly in zero‐shot settings, our trained models with careful data quality management and error mitigation strategies outperformed these models in specific scenarios. This research emphasizes the importance of high‐quality data in enhancing model performance in low‐resource languages and highlights the potential of integrating contextual features like argumentation into propaganda detection models. Future work will focus on expanding datasets, refining problem formulations, and exploring additional contextual features to further advance the field. |
| Description: | M.Sc. (HLST)(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/135376 |
| Appears in Collections: | Dissertations - FacICT - 2024; Dissertations - FacICTAI - 2024 |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2518ICTCSA531005083998_1.PDF | Restricted Access | 2.74 MB | Adobe PDF | |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
