Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/95573
Title: Maltese text recognition tool for those with disability to access text
Authors: Fenech, Emma (2021)
Keywords: Neural networks (Computer science) -- Malta
Text-to-speech software
Speech synthesis
Optical Character Recognition -- Malta
Image processing
Students with disabilities -- Malta
Learning disabilities -- Malta
Issue Date: 2021
Citation: Fenech, E. (2021). Maltese text recognition tool for those with disability to access text (Bachelor’s dissertation).
Abstract: Currently, people who have difficulty accessing printed Maltese text need another person to read it aloud to them. This is not the case for text in English, since commercial text readers are available to carry out the same task. This puts those pursuing the Maltese language at a disadvantage, making them highly dependent on someone else, especially in examination scenarios. This project therefore aimed to mitigate this inequality by designing and developing a Maltese text recognition tool. The main objectives were to accept photographs of a document containing printed Maltese text, recognize that text, and automatically speak it aloud to the user. Initially, a preliminary pipeline was developed to test and understand the limitations of these pre-existing technologies in English. To make the tool more widely accessible, a common mobile phone camera was chosen as the acquisition device rather than a less accessible scanner. This choice, however, introduces image distortions such as local shadows, rotation of the text lines, and perspective warping. An additional image preprocessing algorithm was therefore inserted into the pipeline to ensure that the image being recognized has as few artifacts as possible. Eventually, one Maltese trained data model for the Tesseract Optical Character Recognition engine, aided by these image preprocessing techniques, produced high-quality recognition when evaluated using the Levenshtein distance, and was chosen for use in the project. A custom Maltese text-to-speech neural network, however, had to be created from scratch, based on the Deep Convolutional Text-to-Speech network architecture. It was trained on appropriate training data and fine-tuned until acceptable speech could be synthesized. In a crowdsourced survey, this audio scored a promising 3.40, compared with 4.35 for the ground truth. Once all project segments were confirmed to be fully functional, they were integrated into a single end-to-end tool.
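The abstract reports that recognition quality was evaluated with the Levenshtein distance. As a minimal sketch of how such a comparison can work, the Python below computes the edit distance between a reference transcription and recognized text, then normalizes it by the reference length to give a character error rate. The function names and the normalization are assumptions made for illustration; this record does not state how the dissertation implemented or normalized the measure.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string as b so each row of the table is as small as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, recognized: str) -> float:
    """Edit distance divided by the reference length (hypothetical helper)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, recognized) / len(reference)
```

For example, `levenshtein("għajn", "ghajn")` counts one substitution, which is the kind of error an engine without Maltese-specific characters such as ħ might make.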
Description: B.Eng. (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/95573
Appears in Collections: Dissertations - FacEng - 2021
Dissertations - FacEngSCE - 2021

Files in This Item:
File: Fenech Emma.pdf (Restricted Access)
Size: 20.27 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.