Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/80625
Title: Detecting gender bias and cross-cultural differences from text
Authors: Lagaa, Muhanned B. (2021)
Keywords: English language -- Data processing
Arabic language -- Data processing
Discrimination in language
Language and culture
Machine learning
Issue Date: 2021
Citation: Lagaa, M.B. (2021). Detecting gender bias and cross-cultural differences from text (Bachelor's dissertation).
Abstract: This study centres on detecting cultural differences and gender bias in text. Word embedding models are trained on text corpora in two languages, Arabic and English, and are then tested to obtain results from all the corpora used. Three corpora are used with each model: a News, a Wikipedia, and a Twitter corpus. Two sets of words are examined: emotion words, such as honourable and fear, and profession words, such as nurse and teacher. A survey on gender bias is also conducted to examine cultural difference, with participants drawn from two groups, one of English speakers and one of Arabic speakers, so that the models' results can be compared with the survey results. The study presents interesting results and demonstrates the task of automatically detecting gender bias in text. It is still based on a relatively limited sample of corpora, and the results could be improved by training and testing on larger corpora. Nevertheless, the results show gender bias and cultural differences, and highlight the importance of the corpus used, since the type of corpus itself can be biased.
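The dissertation itself is restricted, so its implementation is not shown here. As an illustrative sketch only (not the author's code), the kind of bias measurement the abstract describes is commonly done by comparing a word's cosine similarity to male-anchored and female-anchored embedding vectors; all vectors below are hypothetical toy values standing in for a trained embedding model.

```python
import numpy as np

def cosine(a, b):
    # Standard cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def gender_bias_score(word_vec, male_vecs, female_vecs):
    # Mean similarity to male anchor words minus mean similarity to
    # female anchor words: positive -> male-leaning, negative -> female-leaning.
    male = np.mean([cosine(word_vec, m) for m in male_vecs])
    female = np.mean([cosine(word_vec, f) for f in female_vecs])
    return male - female

# Toy 3-dimensional vectors (hypothetical, not from the dissertation's corpora).
he = np.array([1.0, 0.2, 0.0])
she = np.array([0.1, 1.0, 0.0])
nurse = np.array([0.2, 0.9, 0.1])  # placed closer to "she" in this toy space

print(gender_bias_score(nurse, [he], [she]))  # negative: female-leaning here
```

In practice the anchor sets would contain many gendered words (he/she, man/woman, etc.) and the word vectors would come from embeddings trained separately on each News, Wikipedia, and Twitter corpus, allowing scores to be compared across corpora and languages.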
Description: B.Sc. (Hons) HLT (Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/80625
Appears in Collections:Dissertations - InsLin - 2021

Files in This Item:
File: Muhanned Bashir Lagaa.pdf (Restricted Access), 999.89 kB, Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.