<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>OAR@UM Collection:</title>
  <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/85741" />
  <subtitle />
  <id>https://www.um.edu.mt/library/oar/handle/123456789/85741</id>
  <updated>2026-04-13T06:48:00Z</updated>
  <dc:date>2026-04-13T06:48:00Z</dc:date>
  <entry>
    <title>A transfer learning approach to facial image caption generation : generating captions of images of faces from Face2Text</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/141973" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/141973</id>
    <updated>2025-12-05T09:32:39Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: A transfer learning approach to facial image caption generation : generating captions of images of faces from Face2Text
Abstract: Current caption generation models do not adequately describe the subject’s appearance when faced with images of human faces. The creation of the Face2Text dataset led us to explore the feasibility of using transfer learning from domain-relevant models to build a model for this purpose. We build an encoder-decoder Convolutional Neural Network (CNN) - Long Short-Term Memory (LSTM) pipeline model, employing an Attention mechanism and VGGFace/ResNet CNNs, to compare different optimized variants and determine the suitability of captions generated from the Face2Text dataset. Comparisons are drawn through both automated metrics and human evaluation by 76 English-speaking participants. The captions generated by the VGGFace-LSTM + Attention model are closest to the Ground Truth according to human evaluation. The highest METEOR score (0.4834) is obtained by the RGFA (ResNet, GloVe, Attention) model; the REFA (ResNet, Uninitialised Word Embeddings, Attention) model obtained the highest CIDEr and CIDEr-D results (1.2520 and 0.6860 respectively), whilst the best BLEU-4 result (0.2538) was obtained by both the RGFA and REFA models. There is less agreement between raters and weak correlation between human evaluation and automated metrics. Qualitatively, most captions give encouraging results, although the model struggles when faced with abnormal facial images. We were successful in our main aim of developing a facial image captioning model for Face2Text using transfer learning, with generated captions being particularly detailed. Although the results are already fit for use in some areas, and potentially beneficial for image retrieval and for users who are blind, this is only to be considered a starting point, and is an encouraging result and baseline for future work.
Description: M.Sc.(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Comparing linguistic and visuo-linguistic representations for noun-noun compound relation classification in English</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/141889" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/141889</id>
    <updated>2025-12-03T11:24:51Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: Comparing linguistic and visuo-linguistic representations for noun-noun compound relation classification in English
Abstract: Noun-noun compounds (NNCs), such as ‘restaurant owner’ and ‘city morgue’, are very frequent in the English language, and new ones are created regularly due to the high productivity of compounding as a word formation process. To fully understand the meaning of an NNC, we need to not only know the meaning of its parts, but also deduce the implicit semantic relationship between them. That is, we need to understand whether ‘city morgue’ means ‘a morgue made of cities’, ‘a morgue located in a city’ or something else entirely. Humans have clear intuitions about what relations can hold between the constituents of an NNC, but interpreting NNCs in a computational setting is a challenge. Accurate NNC processing is crucial for the advancement of many natural language processing tasks, including machine translation, text summarization, and natural language inference. Previous methods of computational NNC interpretation have been limited to approaches involving textual representations and linguistic features. However, research from both cognitive science and natural language processing suggests that grounding linguistic representations in vision or other modalities can increase performance on this and other tasks. Backed up by findings about human conceptual combination as well as theories of symbol grounding, our work is a novel comparison of linguistic and visuo-linguistic representations for the task of NNC interpretation. We frame NNC interpretation as a relation classification task, evaluating our approaches on a large annotated NNC dataset, with over 19,000 relationally-annotated compounds (Tratz, 2011). We employ two lines of experiments; one line explores the use of word2vec (Mikolov et al., 2013a) embeddings, compositionally combined into NNC representations in various ways, as inputs to an SVM classifier. The other line utilizes a BERT model, fine-tuned with a classifier layer on top. 
In both settings, we experiment with combining the textual representations with visual feature vectors obtained with a ResNet (He et al., 2016) model on images from ImageNet (Deng et al., 2009). We find that adding visual features increases performance on almost all data configurations in our SVM experiments, and that the results are statistically significant in some cases. In our BERT experiments, we find that BERT performs well on coarse-grained test data that may include previously seen constituents, but performs poorly on all other data configurations. However, adding raw ResNet feature vectors does increase BERT’s performance on the remaining settings, while normalized ResNet feature vectors contribute to little or no increase in performance. Our findings suggest that a visually grounded approach to NNC interpretation is a promising venture, and we view our novel approach as an encouraging starting point for more investigations into multimodal NNC processing.
Description: M.Sc. (HLST)(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>The association of gender bias with BERT : measuring, mitigating and cross-lingual portability</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/141886" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/141886</id>
    <updated>2025-12-03T11:20:21Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: The association of gender bias with BERT : measuring, mitigating and cross-lingual portability
Abstract: The development of BERT (Devlin et al., 2018) and other contextualized word embeddings (Radford et al., 2019; Peters et al., 2018) brought about a significant performance increase for many NLP applications. For this reason, contextualized embeddings are replacing standard embeddings as the semantic knowledge base in NLP systems. Since a variety of biases were previously found in standard word embeddings (Caliskan et al., 2017), it is crucial to take a step back and assess biases encoded in their replacements as well. This work focuses on gender bias in BERT, aiming to measure bias, compare this bias with real-world statistics and subsequently mitigate it. Gender bias is measured through associations between gender-denoting target words and professional terms (Kurita et al., 2019). For mitigating gender bias, we first apply Counterfactual Data Substitution (CDS) (Maudslay et al., 2019) to the GAP corpus (Webster et al., 2018) and then fine-tune BERT on these data. Since these methods for measuring and mitigating bias were originally devel-&#xD;
oped for English, we also adopt a cross-lingual perspective and test whether the approach is portable to German. Unfortunately, we find that grammatical gender in German strongly influences the associations between target and attribute words, which makes it impossible to measure gender bias using the same methodology applied for English. Therefore, further experiments to mitigate gender bias in the German BERT model are discarded. On one hand, we find that gender bias in the English BERT model is reflective of both real-world data and gender stereotypes. We mitigate this gender bias through fine-tuning on data to which CDS was applied. We hope that our positive results for English can contribute to the development of standardized methods to deal with gender bias in contextualized word embeddings. On the other hand, the fact that these methods do not work for German supports previous research calling for more language-specific work in NLP (Gonen et al., 2019; Sun et al., 2019). In light of BERT’s rising popularity, finding appropriate methods to measure and mitigate bias continues to be an essential task.
Description: M.Sc. (HLST)(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>Datasets and models for authorship attribution on Italian personal writings</title>
    <link rel="alternate" href="https://www.um.edu.mt/library/oar/handle/123456789/141884" />
    <author>
      <name />
    </author>
    <id>https://www.um.edu.mt/library/oar/handle/123456789/141884</id>
    <updated>2025-12-03T11:13:45Z</updated>
    <published>2021-01-01T00:00:00Z</published>
    <summary type="text">Title: Datasets and models for authorship attribution on Italian personal writings
Abstract: Authorship Attribution (AA) is the study of identifying authors by their writing style. Over the past few years, determining the authors of online content has played a crucial role in many fields, such as online security, plagiarism detection and fake news identification. While extensive research has been done in this field for English, little investigation has focused on Italian, with the only outstanding case being the study on Elena Ferrante’s true identity. Existing research on AA focuses on texts for which a lot of data is available (i.e. novels, articles), and which are not necessarily influenced by an author’s personal writing style due to editorial interventions. This study approaches the AA task in terms of Authorship Verification (AV), a binary classification task where, given two texts, the goal is to decide whether or not they are written by the same author. Following Hürlimann et al. (2015) and inspired by the work on blogger identification of Mohtasseb et al. (2009), we run the GLAD AV system on Italian forum comments and personal diaries. We introduce two novel datasets suitable for the AV task, which can be easily adapted to work with other AA tasks. We show the complexity of the data, and analyze the interaction between four different variables, i.e. genre, topic, authors’ gender and number of words taken into account per author. We perform intra-topic, cross-topic and cross-genre experiments and discuss the results obtained for each setting. We show that AV is feasible even with little data, but more evidence helps. Gender and topic can be indicative clues, and if not controlled for, they might overtake more specific aspects of personal style. We also show that, contrary to what other studies have found (Sapkota et al., 2014; Stamatatos et al., 2015), cross-topic and cross-genre results are comparable to intra-topic ones.
Description: M.Sc. (HLST)(Melit.)</summary>
    <dc:date>2021-01-01T00:00:00Z</dc:date>
  </entry>
</feed>