Navigating through a clustered search-space

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/35896

Full metadata record

DC Field	Value	Language
dc.date.accessioned	2018-11-08T08:38:30Z	-
dc.date.available	2018-11-08T08:38:30Z	-
dc.date.issued	2018	-
dc.identifier.citation	Pace, B. (2018). Navigating through a clustered search-space (Bachelor's dissertation).	en_GB
dc.identifier.uri	https://www.um.edu.mt/library/oar//handle/123456789/35896	-
dc.description	B.SC.ICT(HONS)ARTIFICIAL INTELLIGENCE	en_GB
dc.description.abstract	Results presented by a search engine to the user usually consist of a collection of relevant web pages and documents. These usually contain small snippets of each resource that the user searched for. Search-Engine Optimization or accumulated user history rank the results into a meaningful order. This ensures that on top of the list are what the search engine considers to be the most relevant results, for quick retrieval. Yet, these may fail to provide what the user requires. This may be due to inaccurate query keywords or ambiguous keywords. An example of this problem is homographs. The user will have to go through countless results with different topics to find a suitable one. Besides this, the user will have to deal with vast numbers of pages of results. This may cause the required web resource to become concealed. This may throw the user off trying to invest their time into searching further. Search result clustering can reduce this problem. This project aims at implementing a simple, easy-to-use navigation system for clustered search results based on No-K-Means, a search results clustering system. When the user submits a query, the query is submitted to Bing and the results are clustered using No-K-Means. The user can view the clusters by an automatically generated label or by the document title of a representative result in the cluster. A user can 'drill-down' into a cluster, in which case an expanded query is automatically generated and submitted to Bing. The new results are clustered and presented to the user. Our approach is evaluated using a web-based application consisting of three web pages in total. The user can sign up or log in, enter the initial query and navigate through the visualization of the result clusters. The user will press a submit button when the ideal results are found. This will trigger a call for a large number of results using the initial query.
A percentage of how many ideal results are present in the large set of results is calculated. Afterwards, the user is redirected to a statistics page. This page contains various useful user statistics. Examples are the number of clicks, duration, and number of drill-downs. The user has the option to start another session with updated user history. An SQL relational database stores the results, user history and user details for efficient storage and retrieval. The evaluation will also include an online questionnaire. This is used to gather extra feedback on the structure and effectiveness of this method of searching. The chosen design structure for the visualization is in the form of a hierarchical tree structure as it is one of the most efficient ways to visualize a hierarchy of nodes. Drilling down and initiating rollbacks are very intuitive in this form. It comes naturally to a user to click on leaf nodes to expand the tree and on non-leaf nodes to hide any child nodes. The evaluation was carried out with 40 users in order to analyze the performance of the visualization with regard to user interaction and compare the efficiency of the clustering system with an unclustered list of results using Subtopic Reach Time. The inclusion of user history was also tested, observing the improvement between unranked and ranked clusters while the users were going through the visualization. A feedback form was filled in by every user to see whether the provided design was intuitive and user-friendly, and guided the user to their goal with ease without having to manually edit the query. The results show that the users preferred the proposed system over ranked lists of results and the user history improved the overall experience for the users, with improvements being listed as future work.
The statistics also show that there are fewer overall results being examined in this visualization and the users are shown more relevant results that would be hidden in a ranked list of unclassified results. An additional advantage is that more relevant results are obtained by automatically expanding the query to include discriminatory terms on the basis of the user 'drilling down' through clusters, without the user having to manually modify the query.	en_GB
dc.language.iso	en	en_GB
dc.rights	info:eu-repo/semantics/restrictedAccess	en_GB
dc.subject	Algorithms	en_GB
dc.subject	Search engines	en_GB
dc.title	Navigating through a clustered search-space	en_GB
dc.type	bachelorThesis	en_GB
dc.rights.holder	The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder.	en_GB
dc.publisher.institution	University of Malta	en_GB
dc.publisher.department	Faculty of Information and Communication Technology. Department of Artificial Intelligence	en_GB
dc.description.reviewed	N/A	en_GB
dc.contributor.creator	Pace, Brian	-
Appears in Collections:	Dissertations - FacICT - 2018; Dissertations - FacICTAI - 2018

Files in This Item:

File	Description	Size	Format
18BSCIT011.pdf Restricted Access		1.27 MB	Adobe PDF
