Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/93932
Title: A hyper link clustering system
Authors: Dimech, Gabriel (2010)
Keywords: Web sites -- Design
Webometrics
Cluster analysis -- Computer programs
Issue Date: 2010
Citation: Dimech, G. (2010). A hyper link clustering system (Bachelor's dissertation).
Abstract: Many web sites that contain a list of links order those links by time of creation, by source, or by some other criterion that is not always informative. For example, news aggregation web pages may contain several links that point to sets of similar or related news stories, but these are sorted by time of link creation rather than clustered by news story. Search engine results pages for an ambiguous query may similarly contain several links to web pages relevant to the different meanings of the query terms, but these are sorted by relevance to the query rather than clustered by topic. As a result, a reader of such a page may unnecessarily visit web pages that are very similar to each other or, because related links are interspersed among unrelated ones, overlook links that they should follow. We have built a system that clusters the links in a list according to the type of content (or category), giving the user a clearer indication of the content of the main page. Developing the web link clustering system involved integrating a comprehensive and efficient clustering algorithm into the web environment. The system allows the user to specify a list of links to cluster (from a web page) and displays the same list clustered according to content. The main clustering algorithm is adopted from existing techniques and adapted to our requirements. The final product makes a web page containing web articles more readable and eliminates repetition of similar articles in a list. The 'effectiveness' of the clusters produced by the system was evaluated by gathering feedback from users. This method of evaluation required some time to complete as it is not automated; however, we were able to determine the effectiveness of our system from ambiguous user judgement.
Although clustering is based only on the snippets of links, the results show that approximately 68% of the clusters created by the system were judged 'effective' by evaluators. The evaluation also suggests that around 89% of the snippets assigned to clusters were judged to match their cluster either very well or moderately well.
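The abstract does not say which clustering algorithm the dissertation adopts, so the following is only a minimal sketch of the general idea of grouping links by their snippets, not the author's method. It assumes a bag-of-words representation, cosine similarity and a greedy single-pass grouping. The function names, the threshold value and the sample snippets are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words longer than two letters."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_snippets(snippets, threshold=0.3):
    """Greedy single-pass clustering (an assumed stand-in technique).

    Each snippet joins the most similar existing cluster if the similarity
    reaches `threshold` (a hypothetical value); otherwise it starts a new
    cluster. Returns a list of clusters, each a list of snippet indices.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [indices]}
    for index, snippet in enumerate(snippets):
        vector = Counter(tokenize(snippet))
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vector), "members": [index]})
        else:
            best["centroid"].update(vector)  # accumulate term counts
            best["members"].append(index)
    return [cluster["members"] for cluster in clusters]


# Hypothetical link snippets, ordered as a news page might list them.
example_snippets = [
    "Volcano eruption grounds flights across Europe",
    "Flights cancelled as volcano ash cloud spreads over Europe",
    "Election results: coalition talks begin",
    "Coalition talks continue after election results",
]
print(cluster_snippets(example_snippets))  # prints [[0, 1], [2, 3]]
```

A single-pass scheme like this depends on the order of the input and on the chosen threshold. It is shown only because it is short, and a real system would likely need stop-word removal and term weighting as well.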
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/93932
Appears in Collections: Dissertations - FacICT - 2010
Dissertations - FacICTCS - 2010-2015

Files in This Item:
File: B.SC.(HONS)IT_Dimech_Gabriel_2010.PDF
Access: Restricted Access
Size: 11.47 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.