Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/93932
| Title: | A hyper link clustering system |
| Authors: | Dimech, Gabriel (2010) |
| Keywords: | Web sites -- Design; Webometrics; Cluster analysis -- Computer programs |
| Issue Date: | 2010 |
| Citation: | Dimech, G. (2010). A hyper link clustering system (Bachelor's dissertation). |
| Abstract: | Web sites that contain a list of links very often order the links by time of creation, by source, or by some other criterion that is not always informative. For example, news aggregation web pages may contain several links that point to sets of similar or related news stories, but these are sorted by time of link creation rather than clustered by news story. Search engine results pages for an ambiguous query may similarly contain several links to web pages relevant to the different meanings of the query terms, but these are sorted by relevance to the query rather than clustered by topic. As a result, a reader of such a page may unnecessarily visit web pages that are very similar to each other or, because related links are interspersed amongst unrelated ones, may overlook links that they should follow. We have built a system that clusters the links in a list according to the type of content (or category), giving the user a clearer indication of the content of the main page. Developing a web link clustering system involved integrating a comprehensive and efficient clustering algorithm into the web environment. The system allows the user to specify a list of links to cluster (from a web page) and displays the same list clustered according to content. The main clustering algorithm is adopted from existing techniques and adapted to our requirements. The final product makes a web page containing web articles more readable and eliminates repetition of similar articles in a list. To evaluate the 'effectiveness' of the clusters produced by the system, we gathered feedback from users. This method of evaluation took some time to complete because it is not automated; however, we were able to determine the effectiveness of our system from ambiguous user judgement. Although clustering is based only on the snippets of links, results show that approximately 68% of the clusters created by the system were judged 'effective' by evaluators. The evaluation also suggests that around 89% of the snippets assigned to clusters were judged to match their cluster very well or moderately well. |
| Description: | B.Sc. IT (Hons)(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/93932 |
| Appears in Collections: | Dissertations - FacICT - 2010; Dissertations - FacICTCS - 2010-2015 |
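
The abstract's central idea, grouping the links in a list by the content of their snippets, can be illustrated with a minimal sketch. This is not the dissertation's algorithm, which the record does not describe in detail; it assumes a simple word-overlap (Jaccard) similarity and a greedy single-pass grouping. The function names, the stopword list, the `threshold` value, and the example URLs and snippets are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "is", "as", "at", "by"}

def tokens(snippet):
    """Return the set of lowercase words in a snippet, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", snippet.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_links(links, threshold=0.2):
    """Greedy single-pass clustering of (url, snippet) pairs.

    Each link joins the first cluster whose pooled vocabulary is similar
    enough to its snippet; otherwise it starts a new cluster.
    """
    clusters = []  # each entry: {"vocab": set of words, "links": [(url, snippet), ...]}
    for url, snippet in links:
        toks = tokens(snippet)
        for cluster in clusters:
            if jaccard(toks, cluster["vocab"]) >= threshold:
                cluster["links"].append((url, snippet))
                cluster["vocab"] |= toks
                break
        else:
            # No existing cluster was similar enough.
            clusters.append({"vocab": set(toks), "links": [(url, snippet)]})
    return [c["links"] for c in clusters]

# Made-up snippets for an ambiguous query ("jaguar"), as in the abstract's scenario.
links = [
    ("https://example.com/a", "Jaguar unveils new electric car model"),
    ("https://example.com/b", "Jaguar spotted in rainforest reserve"),
    ("https://example.com/c", "New electric car from Jaguar reviewed"),
    ("https://example.com/d", "Rainforest reserve reports jaguar sighting"),
]
for group in cluster_links(links):
    print([url for url, _ in group])
```

With these made-up snippets the car-related links end up in one group and the animal-related links in another. A single-pass scheme like this depends on input order and on the chosen threshold, so it only illustrates the idea of snippet-based grouping.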
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| B.SC.(HONS)IT_Dimech_Gabriel_2010.PDF (Restricted Access) | | 11.47 MB | Adobe PDF | |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
