Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/76880
Title: “To trust a LIAR” : does machine learning really classify fine-grained, fake news statements?
Authors: Mifsud, Mark (2020)
Keywords: Fake news
Machine learning
Natural language processing (Computer science)
Issue Date: 2020
Citation: Mifsud, M. (2020). “To trust a LIAR”: does machine learning really classify fine-grained, fake news statements? (Bachelor's dissertation).
Abstract: Fake news is a contemporary problem that causes serious social harm, making its early detection a critical, and very challenging, problem in machine learning and natural language processing (NLP). This study attempts to automatically classify short claims related to US politics according to various levels of veracity, ranging from True to False to “Pants on Fire” (absolute lies). These statements, taken from the LIAR dataset, were fact-checked and pre-classified by experts for Politifact.com. To achieve a better accuracy score than previous studies, state-of-the-art machine learning models known as transformers were used; transformers have previously performed very well on several NLP tasks. A simple neural network was also used to augment the models so that they could utilise the source’s reputation score, thus enhancing the classification. While higher accuracy was indeed achieved, the methods’ ability to help mitigate the real-life problem of early fake news detection remained in doubt. For this reason, further experiments were carried out on the models to see how variations in the data they are trained on impact their performance. Despite the higher accuracy scores, the resulting models have flaws, which are discussed: they exhibit bias, and they do not really model veracity, which makes them prone to adversarial attacks. It was also noticed that accurately labelling the statements in the dataset as either true or false would require knowledge of the real world. The question inevitably arose as to whether purely NLP-based fake news classification can, in general, really be used to detect deception, or whether it is an ill-posed problem; a critique of this area of study was therefore undertaken.
After scrutinising this study’s own models, previous meta-studies, and psychological studies on detecting deception, the author argues that purely language-based fake news classification should be treated as an ill-posed (unstable) problem. Some potential solutions to stabilise the problem in future studies are suggested, and better evaluation methods than mere accuracy scores are determined to be necessary for measuring the performance of classification models.
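The augmentation described in the abstract, combining a text representation with a scalar source-reputation score before classifying into the six LIAR veracity levels, can be sketched as follows. This is a toy illustration only: the embedding size, weights, and reputation value are hypothetical placeholders, not the dissertation's actual transformer-based models.

```python
# Toy sketch (assumed setup, not the author's code): a linear classifier
# over the six LIAR labels, where a fixed-size text embedding is
# concatenated with a scalar source-reputation score.
import numpy as np

LABELS = ["pants-on-fire", "false", "barely-true",
          "half-true", "mostly-true", "true"]

def featurize(text_vec, reputation):
    """Append the source's reputation score to the text embedding,
    mirroring how the study's augmented models add speaker
    credibility to the transformer output."""
    return np.concatenate([text_vec, [reputation]])

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
dim = 8                                       # toy embedding size
W = rng.normal(size=(len(LABELS), dim + 1))   # +1 for the reputation feature

x = featurize(rng.normal(size=dim), reputation=0.3)
probs = softmax(W @ x)                        # distribution over six labels
pred = LABELS[int(np.argmax(probs))]
```

In a real pipeline the random `text_vec` would be replaced by a transformer's sentence representation, and `W` would be learned; the point here is only the shape of the combination: one extra input dimension carries the reputation signal.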
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/76880
Appears in Collections:Dissertations - FacICT - 2020
Dissertations - FacICTCIS - 2020

Files in This Item:
File: 20BITSD013.pdf (Restricted Access)
Size: 4.26 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.