Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/94043
Title: Application and viability of document classification techniques
Authors: Mercieca, Joanna (2008)
Keywords: Semantic Web
Electronic records
Machine learning
Issue Date: 2008
Citation: Mercieca, J. (2008). Application and viability of document classification techniques (Bachelor's dissertation).
Abstract: Many businesses today are finding it more feasible to transform the paper-based elements of their workflows into fully electronic workflows. Soft copies of documents are becoming the primary source of information. This 'soft-archiving' is being done to address cost and regulatory concerns. It has resulted in a larger volume of documents that have to be sorted by individual employees. Besides being time-consuming, this sorting process also tends to be error-prone if done manually. For the company it is also an expensive process, since employees are diverted from their primary tasks to do chores outside their job description. An automated system that takes on this secondary yet important classification process would thus be ideal, potentially leading to an increase in productivity and efficiency. Software solutions using artificial intelligence and natural language processing techniques are emerging to classify documents into the right categories. Each of these techniques has demonstrated merits and limitations. This thesis provides a broad overview of the various methodologies for classifying documents by investigating natural language techniques, more specifically Text Classification (TC) algorithms. Subsequently, based on this overview, three methods are adopted and implemented in one comprehensive 'document categoriser'. Contrary to previous implementations, the resulting system is not limited by application. In this instance, the resulting system is tested in a company business framework. The various merits and limitations of the different TC algorithms are discussed and compared with other implementations.
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/94043
Appears in Collections: Dissertations - FacICT - 1999-2009
Dissertations - FacICTCS - 2008

Files in This Item:
File: B.SC.(HONS)IT_Mercieca_Joanna_2008.pdf
Access: Restricted Access
Size: 13.98 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.