Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/95573
Title: | Maltese text recognition tool for those with disability to access text |
Authors: | Fenech, Emma (2021) |
Keywords: | Neural networks (Computer science) -- Malta; Text-to-speech software; Speech synthesis; Optical Character Recognition -- Malta; Image processing; Students with disabilities -- Malta; Learning disabilities -- Malta |
Issue Date: | 2021 |
Citation: | Fenech, E. (2021). Maltese text recognition tool for those with disability to access text (Bachelor’s dissertation). |
Abstract: | Currently, people whose disability prevents them from accessing printed Maltese text require another person to read it aloud to them. This is not the case for text in English, since commercial text readers are available to carry out the same task. This puts those studying the Maltese language at a disadvantage, making them heavily dependent on someone else, especially in examination scenarios. This project therefore aimed to mitigate this inequality by designing and developing a Maltese text recognition tool. The main objectives were accepting photographs of a document containing printed Maltese text, recognizing said text, and automatically speaking it out loud to the user. Initially, a preliminary pipeline was developed in English to test and understand the limitations of pre-existing technologies. Moreover, to make the tool more widely accessible, a common mobile phone camera was chosen as the acquisition device, rather than a less accessible scanner. This, however, introduces certain image distortions, such as local shadows, rotation of the text lines, and perspective warping. It was therefore necessary to add an image preprocessing algorithm to the pipeline mentioned above, to ensure that the image being recognized has as few artifacts as possible. Eventually, one Maltese-trained data model of the Tesseract Optical Character Recognition engine, aided by the aforementioned image preprocessing techniques, produced high-quality recognition when evaluated with a Levenshtein distance calculator, and was chosen for use in the project. A custom Maltese text-to-speech neural network, however, had to be created from scratch, based on the Deep Convolutional Text-to-Speech network architecture. This was trained on appropriate training data and fine-tuned until acceptable speech could be synthesized.
In a crowdsourced survey, this audio scored a promising 3.40, compared with 4.35 for the ground truth. Once all project segments were confirmed to be fully functional, they were integrated into a single end-to-end pipeline, completing the final tool. |
Description: | B.Eng. (Hons)(Melit.) |
URI: | https://www.um.edu.mt/library/oar/handle/123456789/95573 |
Appears in Collections: | Dissertations - FacEng - 2021; Dissertations - FacEngSCE - 2021 |
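The abstract reports that recognition quality was evaluated through the Levenshtein distance. As an illustration only, the sketch below shows how such a measure could be computed between a reference transcription and recognised text. It is not taken from the dissertation: the function names `levenshtein` and `character_error_rate`, and the idea of normalising by the reference length, are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string as the inner dimension to save memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, recognised: str) -> float:
    """Hypothetical helper: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, recognised) / len(reference)


# Example: the Maltese letter "ħ" misread as a plain "h".
distance = levenshtein("għajn", "ghajn")
rate = character_error_rate("għajn", "ghajn")
```

A lower distance means the recognised text is closer to the reference. Dividing by the reference length is one common way to compare documents of different sizes, though the dissertation may have reported the measure differently.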
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Fenech Emma.pdf | Restricted Access | 20.27 MB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.