Academic Year 2011/2, Semester 2

Multimodal Communication, Corpora and Systems


Dr Patrizia Paggio

Institute of Linguistics 

University of Malta

The purpose of this course is to introduce students to the interdisciplinary area of multimodal research and to familiarise them with methods for the collection, annotation and analysis of multimodal data. The course combines lectures and practical work. It consists of three parts, as detailed below.


Interaction of speech and non-verbal behaviour in human communication

In the first part of the course, theoretical approaches to multimodal communication from the literature will be presented, and examples from video material will be shown and discussed. The main focus will be on methodologies for the classification of different types of non-verbal behaviour (hand gestures, head movements, body posture, facial expressions). The interaction of gesture with speech, and how it can be modelled in cognitive or linguistic terms, will be discussed. Related topics that may be included in the course are the use of gesture in sign language as well as cultural differences in non-verbal behaviour.

Annotation models and procedures

In this part, various approaches to gesture annotation, from those implying a detailed analysis of gesture shape to those based on categorical or functional interpretation, will be described. Issues related to the identification, segmentation and interpretation of multimodal contributions, including how to measure inter-annotator agreement, will be discussed. Students will annotate selected examples using the ANVIL annotation tool.
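
As an illustration of how inter-annotator agreement might be measured, the sketch below computes Cohen's kappa for two annotators who have each assigned one categorical label to the same sequence of gesture segments. Cohen's kappa is only one of several possible agreement measures and is not necessarily the one used in the course; the function name, the label set and the example annotations are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumes one categorical label per item and identical segmentation
    for both annotators (an assumption made for this sketch).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items")

    n = len(labels_a)

    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement, from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical head-movement labels for six segments.
annotator_a = ["nod", "nod", "shake", "nod", "tilt", "shake"]
annotator_b = ["nod", "shake", "shake", "nod", "tilt", "nod"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 3))
```

Kappa corrects the raw proportion of matching labels for the agreement expected by chance, which matters when some gesture categories are much more frequent than others.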

Multimodal corpora, analyses and systems

The final part of the course deals with existing annotated multimodal corpora. It will show how these corpora have been analysed to gain insight into various aspects of multimodal communication, and how machine learning experiments can be conducted on multimodal data. Finally, examples will be given of how multimodal data are used to develop multimodal interfaces, especially those relying on talking heads and embodied conversational agents.



Last Updated: 31 August 2012

