Title: Achieving maximum reading coverage using ontology mining
Authors: Dalli, Jake J.
Keywords: Ontologies (Information retrieval)
Genetic algorithms
Approximation algorithms
Issue Date: 2015
Abstract: This project addresses the problem of selecting news articles so that a reader gets maximum variety from minimum reading. It has two primary objectives: to explore the modeling and automatic extraction of concepts from embodied language, and to explore a method for optimal article selection from a query. We extract and structure data using known web scraping methods, template removal algorithms and named entity recognition tools. We then conceptualize news articles as sets of named entities and explain how article selection is essentially the minimum set cover problem, which we approximate. The core contribution of this project is a solution which aggregates web documents, removes template code, structures text using named-entity recognition, and approximates the minimum set cover problem with a genetic algorithm.
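The set cover formulation in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names (`repair`, `prune`, `genetic_set_cover`), the greedy repair and pruning heuristics, the parameter values and the example entities are all assumptions made for illustration. Each article is a set of named entities, and the genetic algorithm searches for a small selection of articles whose entities together cover every entity in the collection.

```python
import random


def repair(chosen, articles, universe):
    """Make a selection feasible: add articles until every entity is covered."""
    chosen = set(chosen)
    covered = set().union(*(articles[i] for i in chosen))
    while covered != universe:
        # Add the article that covers the most still-missing entities.
        best = max(range(len(articles)), key=lambda i: len(articles[i] - covered))
        chosen.add(best)
        covered |= articles[best]
    return chosen


def prune(chosen, articles, universe):
    """Drop articles whose entities are already covered by the rest."""
    for i in sorted(chosen, key=lambda i: len(articles[i])):
        rest = chosen - {i}
        if set().union(*(articles[j] for j in rest)) == universe:
            chosen = rest
    return chosen


def genetic_set_cover(articles, generations=50, pop_size=20,
                      mutation_rate=0.1, seed=0):
    """Approximate a minimum set cover; returns indices of selected articles."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    universe = set().union(*articles)
    n = len(articles)

    def make_feasible(bits):
        chosen = {i for i in range(n) if bits[i]}
        chosen = prune(repair(chosen, articles, universe), articles, universe)
        return tuple(i in chosen for i in range(n))

    population = [make_feasible([rng.random() < 0.5 for _ in range(n)])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=sum)  # fitness: fewer selected articles is better
        parents = population[: pop_size // 2]  # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]  # bit-flip mutation
            children.append(make_feasible(child))
        population = parents + children
    best = min(population, key=sum)
    return [i for i in range(n) if best[i]]


# Hypothetical articles, each reduced to its set of named entities.
articles = [
    {"Malta", "EU"},
    {"EU", "Brexit", "UK"},
    {"Malta", "Valletta"},
    {"UK", "Brexit"},
    {"Valletta", "EU", "Malta"},
]
selected = genetic_set_cover(articles)
print(selected)
```

Every candidate is repaired into a valid cover before it is scored, so the fitness function only needs to count selected articles. This is one common design choice for set cover with genetic algorithms; a penalty term for uncovered entities would be an alternative.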
Description: B.SC.IT(HONS)
Appears in Collections: Dissertations - FacICT - 2015

Files in This Item:
Description        Size     Format
Restricted Access  2.09 MB  Adobe PDF

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.