Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/120586

| Title: | Detecting counterfactuals across COVID‐19 literature |
| Authors: | Refalo, Braden (2023) |
| Keywords: | COVID-19 Pandemic, 2020-2023; Scientific literature; Natural language processing (Computer science) |
| Issue Date: | 2023 |
| Citation: | Refalo, B. (2023). Detecting counterfactuals across COVID‐19 literature (Master's dissertation). |
| Abstract: | The COVID‐19 pandemic has undoubtedly resulted in a deluge of information and research. Although some studies were thoroughly evaluated, others were forcefully and hastily conducted. Consequently, conflicting findings have emerged, eroding public trust in the scientific community. Given such circumstances, the need for automated fact verification systems has become progressively more crucial, especially when justifying certain claims and determining research choices. The pandemic initially manifested as a resource‐constrained environment, marked by limited research and scarce resources. The limited availability of resources in newly emerging areas poses significant challenges for effectively training fact verification systems on such domains. Cognisant of these limitations, we employed back‐translation to generate additional claims in smaller datasets, along with other augmentation methods. The final pipeline incorporated this synthetic data to train a BART‐based transformer model for generating and predicting scientific claims from texts. This model served as a proof of concept, demonstrating modest potential in enhancing the accuracy of existing fact verification systems. On this account, we introduce BAsseRT‐Gen, a transformer model built on the BART architecture, exhibiting state‐of‐the‐art performance for scientific claim generation. It has obtained remarkable precision scores in ROUGE‐1, ROUGE‐2 and ROUGE‐L, with values of 0.7411, 0.5585 and 0.6906, respectively. BAsseRT‐Gen can be readily employed to augment scientific and low‐resource corpora with fluent and coherent generated claims. Through the employment of this approach, we have demonstrated a modest improvement in the performance of an existing fact verification system. The study concludes with a proof‐of‐concept demonstration performed in a simulated COVID‐19 environment with limited resources, to showcase the effectiveness of the proposed approach in a zero‐shot manner. Our research contributes to the emerging and novel field of synthetic claim generation while highlighting its potential to address challenges in low‐resource environments and enhance the performance of existing fact verification systems. |
| Description: | M.Sc.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/120586 |
| Appears in Collections: | Dissertations - FacICT - 2023; Dissertations - FacICTAI - 2023 |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2319ICTICS520000010886_1.PDF | | 6.05 MB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
