Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/10569
Title: | Achieving maximum reading coverage using ontology mining |
Authors: | Dalli, Jake J. |
Keywords: | Ontologies (Information retrieval) Genetic algorithms Approximation algorithms |
Issue Date: | 2015 |
Abstract: | The purpose of this project is to address the problem of selecting news articles for maximum variety with minimum reading. The project has two primary objectives: to explore the modeling and automatic extraction of concepts from embodied language, and to explore a method for optimal article selection from a query. In this project we extract and structure data using known web scraping methods, template removal algorithms and named entity recognition tools. Finally, we conceptualize news articles as sets of named entities, and explain how this problem is essentially the minimum set cover problem, which we approximate. The core contribution of this project is a solution which aggregates web documents, removes template code, structures text using named-entity recognition, and approximates the minimum set cover problem using a genetic algorithm. |
Description: | B.SC.IT(HONS) |
URI: | https://www.um.edu.mt/library/oar//handle/123456789/10569 |
Appears in Collections: | Dissertations - FacICT - 2015 |
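
The abstract frames article selection as the minimum set cover problem over sets of named entities. As a rough illustration only, the sketch below uses the standard greedy approximation, which is not the genetic algorithm the dissertation describes. The function name `greedy_set_cover`, the article ids and the entity sets are all hypothetical and are not taken from the dissertation.

```python
def greedy_set_cover(articles):
    """Pick articles until every named entity is covered.

    `articles` maps a (hypothetical) article id to its set of named entities.
    Greedy heuristic: repeatedly take the article that covers the most
    still-uncovered entities. This is a stand-in for illustration, not the
    genetic algorithm used in the dissertation.
    """
    # The universe is every entity mentioned by at least one article.
    uncovered = set().union(*articles.values())
    chosen = []
    while uncovered:
        # Ties are broken by insertion order of the dict.
        best = max(articles, key=lambda a: len(articles[a] & uncovered))
        chosen.append(best)
        uncovered -= articles[best]
    return chosen


# Made-up example data: four articles and the entities each one mentions.
articles = {
    "a1": {"Malta", "EU", "Valletta"},
    "a2": {"EU", "Brussels"},
    "a3": {"Valletta"},
    "a4": {"Brussels", "NATO"},
}
selection = greedy_set_cover(articles)
print(selection)
```

In this toy input, reading the two selected articles already mentions every entity found across all four, which is the "maximum coverage, minimum reading" trade-off the abstract describes.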
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
15BSCIT027.pdf (Restricted Access) | | 2.09 MB | Adobe PDF | View/Open Request a copy |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.