Assessing Biometric Systems: an Overview of the NIST Series of Speaker Recognition Evaluations and Technologies

Joaquin Gonzalez-Rodriguez
Universidad Autonoma De Madrid, Spain


The testability of systems designed to identify people from biometric traces is a key feature of both commercial and forensic identification systems and procedures. Speaker recognition, the identification of talkers from their voices, is an especially challenging domain: the identity information is not directly accessible from the observed features, which makes speaker modelling a complex and highly flexible task, since different sources of speaker-specific information can be extracted and modelled. Moreover, the speech signal itself is subject to multiple sources of variability between different recordings of the same speaker, ranging from intrinsic variability due to the speaker (manner of speaking, type of conversation, emotional state, interaction with the listener, dialect, sociolect, spoken language in bilingual speakers, health conditions, etc.) to extrinsic variability due to external factors (different microphones and/or recording devices, noise and reverberation, distance and position relative to the microphones, coding and transmission artifacts, etc.). Fortunately, the series of speaker recognition evaluations sponsored by the US NIST and carried out from 1996 to 2012 has been a success story in the development and evaluation of such systems, where the conjunction of financial support, task definition and strong commitment from the scientific community resulted in more than a decade of impressive yearly progress. Rather than focusing on the progress of the technology, which we will only highlight, this presentation focuses on the challenges that systems had to face from evaluation to evaluation, with special attention to the threshold-independent evaluation of the goodness of biometric detectors, which enables systems to work at any operating point (as a result of new cost functions or new values of the function parameters) through a separation of the discrimination and calibration capabilities of the systems.
However, data-dependent calibration raises new questions and challenges, which will be briefly highlighted and opened to discussion.
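
As a rough illustration of the distinction between evaluating at a fixed operating point and evaluating independently of any threshold, the sketch below computes two quantities that are commonly used in this literature: a detection cost at one operating point, and the log-likelihood-ratio cost (Cllr), which summarises performance over all operating points. The abstract does not say which metrics the talk covers, so the choice of these two is an assumption, and all function names and example scores are hypothetical. Scores are assumed to be natural-log likelihood ratios.

```python
import math

def bayes_threshold(p_target, c_miss=1.0, c_fa=1.0):
    """LLR threshold that minimises expected cost at one operating point."""
    return math.log((c_fa * (1.0 - p_target)) / (c_miss * p_target))

def detection_cost(target_llrs, nontarget_llrs, p_target, c_miss=1.0, c_fa=1.0):
    """Expected cost when decisions are made at the Bayes threshold.

    Threshold-dependent: changing p_target, c_miss or c_fa moves the threshold.
    """
    thr = bayes_threshold(p_target, c_miss, c_fa)
    p_miss = sum(s < thr for s in target_llrs) / len(target_llrs)
    p_fa = sum(s >= thr for s in nontarget_llrs) / len(nontarget_llrs)
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost in bits; no threshold is involved."""
    t = sum(softplus(-s) for s in target_llrs) / len(target_llrs)
    n = sum(softplus(s) for s in nontarget_llrs) / len(nontarget_llrs)
    return 0.5 * (t + n) / math.log(2.0)

# Hypothetical scores: the same detector, before and after a constant shift.
targets, nontargets = [2.0, 3.0], [-2.0, -3.0]
shifted_targets = [s + 5.0 for s in targets]
shifted_nontargets = [s + 5.0 for s in nontargets]

well_calibrated = cllr(targets, nontargets)
miscalibrated = cllr(shifted_targets, shifted_nontargets)
```

In this toy example the shift leaves the ranking of the scores untouched, so discrimination is unchanged, yet Cllr gets worse because the scores no longer behave as likelihood ratios. That gap is the calibration loss, and a system that outputs uninformative scores (all LLRs equal to zero) obtains a Cllr of exactly 1 bit.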

Joaquin Gonzalez-Rodriguez

Joaquin Gonzalez-Rodriguez received the M.S. degree in 1994 and the Ph.D. degree “cum laude” in 1999, both in electrical engineering, from Univ. Politecnica de Madrid (UPM), Spain. After 15 years of research and lecturing at UPM, he moved in May 2006 to the Computer Science Department at Univ. Autonoma de Madrid (UAM), Spain, as an Associate Professor. Since May 2011, he has been a Full Professor in the Electronic and Communications Technologies Department at UAM, where he leads the Speech&Audio group of the ATVS-Biometric Recognition Group, which has been producing proprietary implementations of state-of-the-art speaker and language recognition technology during the last decade. He has led ATVS-UAM participations in several NIST Speaker (2001, 2002, 2004, 2005, 2006, 2008, 2010 & 2012) and Language (2005, 2007, 2009 & 2011) Recognition Evaluations. He is a member of ISCA (International Speech Communication Association) and of the Signal Processing Society of IEEE, and since 2000 he has been an invited member of the FSAAWG (Forensic Speech and Audio Analysis Working Group) in ENFSI (European Network of Forensic Science Institutes). His research has focused on speaker and language recognition and on the proper use of Automatic Speaker Recognition in Forensic Science. In September 2008, he delivered a plenary keynote talk at Interspeech 2008 in Brisbane (Australia) entitled “Forensic Automatic Speaker Recognition: Fiction or Science?”. In March 2009, he received a Google Research Award for the project entitled “Exploiting prior knowledge for robust recognition and indexing of audio information sources”. From July 2010 to July 2011, he was a Visiting Scholar at the University of California at Berkeley, working as a visiting scientist at the International Computer Science Institute (ICSI). He has led several national and European publicly funded, peer-reviewed research projects and over 25 privately funded research & development and technology contracts in the last decade.