Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/91646
Title: Automatic speech recognition in the assessment of child speech
Other Titles: Manual of clinical phonetics
Authors: Buttigieg, Loridana
Grech, Helen
Fabri, Simon G.
Attard, James
Farrugia, Philip
Keywords: Automatic speech recognition
Speech perception
Speech perception in children
Speech processing systems
Issue Date: 2021
Publisher: Routledge
Citation: Buttigieg, L., Grech, H., Fabri, S. G., Attard, J., & Farrugia, P. (2021). Automatic speech recognition in the assessment of child speech. In M. J. Ball (Ed.), Manual of clinical phonetics (pp. 508-514). Routledge.
Abstract: Speech sound disorders (SSD) is an umbrella term referring to any difficulty or combination of difficulties with perception, motor production or phonological representation of speech sounds and speech segments (ASHA, 2020). SSD can be organic or functional in nature (ASHA, 2020): functional SSD refer to idiopathic disorders while organic SSD have a known cause, often reflecting anatomical, physiological or neurological deficits. Speech-language pathologists (SLPs) are trained to make use of phonetic transcription to record and evaluate their clients’ speech production abilities (McLeod & Verdon, 2017; Ray, 2014). The International Phonetic Alphabet (IPA) (International Phonetic Association, revised to 2005) is used to transcribe the clients’ speech in terms of the place and manner of articulation. It is critically important that SLPs who assess and treat SSD, and researchers who study children’s speech production, make accurate judgments and measures of children’s speech (Munson, Schellinger & Carlson, 2012). Since manual phonetic transcriptions have been reported to be time-consuming, costly and prone to error (Cucchiarini & Strik, 2003), automatic procedures have the potential to offer a quicker, cheaper and more accurate alternative (Van Bael, Boves, van den Heuvel & Strik, 2007). In fact, researchers have been investigating ways of automating the process of phonetic transcriptions, for example by utilizing speech recognition algorithms and automatic speech recognition (ASR) systems (Cucchiarini & Strik, 2003). ASR refers to the process by which a machine recognizes and acts upon an individual’s spoken utterance (Young & Mihailidis, 2010). An ASR system typically consists of a microphone, computer, speech recognition software and some type of audio, visual or action output (Young & Mihailidis, 2010). The automatic conversion of speech to text is one of the most popular ASR applications (Young & Mihailidis, 2010).
ASR systems require algorithms to analyse pauses between syllables, relative syllable stress between strong and weak syllables and phonemic accuracy devices which recognize the child’s speech as correct or incorrect. ASR of adults’ speech has improved significantly lately, yet less progress has been made in recognizing the speech of young children (Shivakumar, Potamianos, Lee & Narayanan, 2014; Yeung & Alwan, 2018). In fact, ASR for children is still a poorly understood area (Benzeghiba et al., 2007; Yeung & Alwan, 2018), particularly because children’s speech is highly variable. Children’s voices and speech differ through development and across age groups (Gong et al., 2016). Additionally, both the pronunciation and rate of speech of children differ immensely from those of adults (Gong et al., 2016; Knowles et al., 2015). It has been shown that children younger than 10 exhibit more variations in vowel durations and larger suprasegmental features (Benzeghiba et al., 2007). Consequently, many existing speech recognition algorithms based on adults may not transfer well to children (Gerosa, Giuliani & Brugnara, 2007). The high variability in typical children’s speech suggests that differentiating children with impairments from those who are typically developing is a much more challenging task (Gong et al., 2016).
URI: https://www.um.edu.mt/library/oar/handle/123456789/91646
ISBN: 9780367336295
Appears in Collections: Scholarly Works - FacEngESE

Files in This Item:
File: Automated Speech Recognition in the Assessment of Child Speech.pdf
Description: Restricted Access
Size: 9.13 MB
Format: Adobe PDF

Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.