Title: Big social data : predicting users’ interests from their Social Networking activities
Authors: Engerer, Bernhardt David
Keywords: Social networks
Natural language processing (Computer science)
User-generated content
Issue Date: 2019
Citation: Engerer, B.D. (2019). Big social data : predicting users’ interests from their Social Networking activities (Master's dissertation).
Abstract: The amount of data produced by Social Network users, whether through direct content creation or as a byproduct of their Social Network usage, is ever increasing. It has been shown that it is possible to predict the demographic profile of a Social Network user based on the content they create and their similarity to other users, as well as to guess certain personality traits using the same methods. While the content created by the user may be rich in information, it is nonetheless difficult to extract meaningful knowledge from this content, mostly due to the open nature of the format of such content as well as the difficulties of Natural Language Understanding. This research presents an approach to predicting unknown user interest in entities based on Entity Extraction from User Generated Content and through the use of a Potential Link Prediction algorithm for recommendations. An algorithm was developed which is able to extract relevant entities from the microtext forming the metadata of Facebook pages liked by a user. These entities are then used in order to suggest other potentially interesting pages to the user. Additionally, crowd-sourced knowledge is used in order to automatically filter out entities which are likely to be irrelevant to future users based on past ratings. Using these filtered entities and by having at least 10 interests disclosed by a user, it is possible to predict further entities of interest to a user, with at least 80% confidence in the predictions. Despite a low number of pages being available for recommendation, over three-quarters of these entities were deemed relevant by users when suggested to them, and while there is no gold-standard dataset with which to compare this result, it was nonetheless judged to be a significant indicator of the success of the selected methodology.
Appears in Collections: Dissertations - FacICT - 2019
Dissertations - FacICTAI - 2019

Files in This Item:
Description: Restricted Access
Size: 1.4 MB
Format: Adobe PDF

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.