Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/78289
Title: Vision based speech recognition system
Authors: Farrugia, Liam
Keywords: Speech processing systems
Automatic speech recognition
Lipreading
Issue Date: 2014
Citation: Farrugia, L. (2014). Vision based speech recognition system (Master's dissertation).
Abstract: Visual Speech Recognition is a field that has, in recent years, been subject to further study, since it can help improve robustness to noise in Automatic Speech Recognition (ASR) systems. Lip reading in humans is used to understand speech from visual cues, mainly related to the movement of the lips. This project explores the idea of recognizing visemes, the basic units of lip reading, using a computerized system. This can improve results from audio-only ASRs, which are affected by noisy audio, or provide speech recognition where no audio is available. The project is divided into modular parts concerning lip segmentation based on colour-space information, lip feature extraction, forced audio alignment, and viseme classification using a number of classifiers: Artificial Neural Networks, Support Vector Machines, Hidden Markov Models and Hidden Conditional Random Fields. The outputs of the different parts were used to generate the final results, which show negligible recognition rates. This was mainly attributed to errors accumulating from one step to the next, producing noise that affected the data used for training. Finally, the way forward is discussed in view of the results obtained.
Description: M.SC.ICT COMMS&COMPUTER ENG.
URI: https://www.um.edu.mt/library/oar/handle/123456789/78289
Appears in Collections:Dissertations - FacICT - 2014
Dissertations - FacICTCCE - 2014

Files in This Item:
File: M.SC.ICT_Farrugia_Liam_ 2014.pdf (Restricted Access) — 17.82 MB, Adobe PDF
File: Farrugia_Liam_acc.material.pdf (Restricted Access) — 64.47 kB, Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.