Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/22577
Title: | Automatic clustering of news reports |
Authors: | Azzopardi, Joel |
Keywords: | Document clustering; Cluster analysis -- Data processing; Cluster analysis -- Computer programs; News Web sites |
Issue Date: | 2007 |
Publisher: | University of Malta. Faculty of ICT |
Citation: | Azzopardi, J. (2007). Automatic clustering of news reports. 5th Computer Science Annual Workshop (CSAW’07), Msida. 11-23. |
Abstract: | The automatic clustering of news reports from various web-based news sites into clusters according to the event they cover serves not only to facilitate browsing of news reports by users but may also serve as an initial stage in other complex systems such as Multi-Document Summarization systems or Document Fusion systems. In contrast to the usual scenarios of document clustering whereby the document collections are static or quasi-static, news sites are continuously updated with reports concerning new events. Here, we present a News Report Clustering system which is able to receive a stream of news reports which it clusters on the fly according to the event they cover. New clusters are automatically created as necessary for news reports which are covering ‘new’, previously unreported events. We compare the results of our system to the results produced by a standard K-Means clustering system, and we show that our system performs significantly better than the standard K-Means system even though the K-Means system was supplied with the correct number of clusters that should be produced. In fact, our clustering system obtained an average of 11.95% better recall, 28.68% better precision and 0.89% less fallout than the standard K-Means clustering system. |
URI: | https://www.um.edu.mt/library/oar//handle/123456789/22577 |
Appears in Collections: | Scholarly Works - FacICTAI; Scholarly Works - FacICTCS |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Proceedings of CSAW’07 - A2.pdf | | 296.91 kB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.