Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/64189
Title: Automated report generation from football match commentary
Authors: Seracino, Jake
Keywords: Soccer matches
Sports journalism
Online journalism
Automatic abstracting
Natural language processing (Computer science)
Issue Date: 2020
Citation: Seracino, J. (2020). Automated report generation from football match commentary (Bachelor's dissertation).
Abstract: The sheer popularity of football means that most matches are extensively covered both before and after they are played. A considerable number of websites offer real-time commentary detailing the match play-by-play while it is still ongoing. Another common practice is to produce a brief post-match report on the main highlights of the match, such as goals. Such a task is time-consuming, as it requires the writer to watch the match and then write the report. Moreover, online portals try to produce and upload the report as soon as possible after the match ends, while interest in it is still at its peak. The main aim of this research is to propose a system that automatically generates a football match report from a given match’s commentary. We frame the problem as one of extractive summarization, whereby each comment in the commentary is considered a candidate for inclusion in the final report. From each candidate, we extract a number of features so that the candidates may be scored and ordered. These features include ones used in typical summarization tasks, ones proposed in similar literature, and our own contribution of keyword features. Furthermore, we propose a second system which uses sequential rule mining to identify event sequences, denoted as episodes, within match commentaries so that they may be considered when compiling the output report. Experiments on test data indicated that the proposed system is effective at its intended task. Evaluation further indicated that the system which does not consider episodes performs better than the one which does. Finally, both systems were shown to outperform all baselines on the ROUGE metrics we considered, and the improvements are statistically significant.
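The extractive framing described in the abstract (treat each comment as a candidate, score it from features, then order the selected comments) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the keyword list, the weights, the length feature, and the names `Comment`, `score`, and `summarize` are all assumptions made for this example, and the sample commentary is invented.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical keyword weights; the abstract does not list the actual keyword features.
KEYWORDS = {"goal": 3.0, "penalty": 2.0, "red card": 2.0, "substitution": 0.5}


@dataclass
class Comment:
    minute: int
    text: str


def score(comment: Comment) -> float:
    """Score a candidate: summed keyword weights plus a capped length bonus (both assumed features)."""
    text = comment.text.lower()
    keyword_score = sum(weight for key, weight in KEYWORDS.items() if key in text)
    length_bonus = min(len(text.split()), 20) / 20  # at most 1.0
    return keyword_score + length_bonus


def summarize(commentary: list[Comment], k: int = 3) -> list[Comment]:
    """Select the k highest-scoring comments, then restore chronological order."""
    top = sorted(commentary, key=score, reverse=True)[:k]
    return sorted(top, key=lambda c: c.minute)


# Invented sample commentary, for illustration only.
commentary = [
    Comment(3, "Throw-in for the home side."),
    Comment(17, "Goal! A low shot into the bottom corner."),
    Comment(44, "Red card shown after a late tackle."),
    Comment(60, "Substitution for the visitors."),
    Comment(78, "Penalty awarded and converted. Goal!"),
]

report = summarize(commentary, k=3)
for comment in report:
    print(f"{comment.minute}' {comment.text}")
```

Selecting by score and then re-sorting by minute keeps the output readable as a match narrative; a system that also considers episodes, as the abstract's second system does, would need to select groups of related comments rather than individual ones.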
Description: B.Sc. ICT (Hons) Artificial Intelligence
URI: https://www.um.edu.mt/library/oar/handle/123456789/64189
Appears in Collections: Dissertations - FacICT - 2020
Dissertations - FacICTAI - 2020

Files in This Item:
File: 20BITAI011 - Seracino Jake.pdf
Access: Restricted Access
Size: 1.48 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.