Andrea De Marco, University of Malta

Title: Acoustic Approaches to Accent Identification

Date: Friday 13 May 2016 at 12:00 hrs

Venue: GW256 

There has been considerable research on the problems of speaker and language recognition from samples of speech. A less researched problem is that of accent recognition. Although this is similar to language identification, the differences between accents of the same language are more fine-grained than the differences between languages. This makes accent recognition a tougher problem for traditional classification techniques. This talk will go over recent work and evaluate a number of techniques for accent classification. The proposed techniques are novel modifications and extensions to state-of-the-art algorithms, and they result in enhanced performance on accent recognition.

The bulk of the work is concerned with applying the i-Vector technique, the most successful approach to acoustic classification to have emerged in recent years, to accent identification. We show that it is possible to achieve high-accuracy accent identification without relying on transcriptions or phoneme recognition algorithms. The seminar will describe the stages in the development of i-Vector based accent classification that improve on the standard approaches usually applied to speaker or language identification, which are insufficient for this task. We demonstrate that very good accent identification performance is possible with acoustic methods by considering different i-Vector projections, frontend parameters and i-Vector configuration parameters, together with an optimised fusion of the resulting i-Vector classifiers that can be obtained from the same data.

The overall claim is that this work achieves the best accent identification performance on the test corpus for acoustic methods, with an identification rate of up to 90%. This performance is even better than that of previously reported acoustic-phonotactic systems on the same corpus, and is very close to the performance obtained via transcription-based accent identification. The seminar will also go over the use of this technique for speech recognition, where it leads to considerably lower word error rates.


Erika Hoffmann-Dilloway, Oberlin, Ohio, US

Title: Writing the Moving Body: Bivalency and Simultaneity Between Codes and Modalities in a German School for the Deaf

Date: Friday 6 May 2016 at 17:30 hrs

Venue: GWHC 

In a German classroom for deaf students, in addition to learning to write using the Roman script, pupils write both German and German Sign Language (DGS) using SignWriting, a movement writing system. One effect of inscribing both of these languages in sound-based and movement-based scripts is to heighten students’ metalinguistic awareness of bivalencies between these codes (e.g., attracting attention to semiotic forms that occur in both languages) and simultaneities between the modalities through which they are produced (e.g., attracting attention to visual and kinetic modalities inherent in producing spoken German as well as the ways in which DGS signing practice can produce sound). This talk draws on ethnographic research conducted in 2010 and 2012 to analyze the ways in which this unusual pedagogical approach allows students who enter the class with highly diverse linguistic repertoires (in terms of the languages they use and the modalities through which they are able to access language) to draw on the semiotic forms they control in their efforts to acquire new linguistic resources.


Benjamin Saade, University of Bremen, Germany

Title: Romance derivation in Maltese: a project report

Date: Friday 29 April 2016 at 12:00 hrs

Venue: GW256

Maltese, an Arabic language with a heavily mixed Semitic/Romance lexicon, has been the focus of studies concerning loan verbs (Mifsud, 1993) and the complex interplay of different systems of verbal morphology (Spagnol, 2011). Comparatively few studies have been concerned with the Romance element in the Maltese derivational system, with the exception of some treatment by Brincat (2012). Numerous derivational elements (mainly from Sicilian and Italian) have been integrated into the Maltese morphological system. These derivational formatives exhibit varying degrees of productivity and are subject to different sets of restrictions regarding the nature of their bases (word class, phonological structure, Semitic/Romance/English origin). Firstly, my study will catalogue the different Romance derivational formatives in Maltese, taking into account their functions, restrictions and frequencies. Secondly, a quantitative study using the MLRS (Maltese Language Resource Server) corpus will compare the productivity of a selection of affixes in Maltese with the productivity of the cognate affixes in Italian, using Italian data and a variable corpus approach by Gaeta & Ricca (2006). In addition, the Maltese phenomenon of ‘pseudo-Romance loans’ (taking a Romance form but an English meaning) will be discussed in the context of the broader implications of this research for theories of language contact such as MAT vs PAT borrowing (Sakel, 2007). Ultimately, the combination of cataloguing formatives and assessing and comparing their productivity and restrictions, supported by a more fine-grained analysis of phenomena connected to creativity and language contact, will shed light on the following questions:

1. How integrated are the Romance derivational formatives in Maltese (application to Semitic and Romance bases)?

2. Is it possible to establish classes of affixes that are subject to similar applicability restrictions, possibly relating to different etymological strata (earlier Sicilian vs later Italian borrowings)?

3. Is the productivity of the formatives in Maltese comparable to the patterns found in the source languages?

4. Can these observations lead to a better understanding of the process of affix borrowing and the underlying language contact situation (Gardani, Arkadiev & Amiridze, 2015; Matras & Sakel, 2007)?
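
As a rough illustration of what a corpus-based productivity comparison can look like, the sketch below computes a hapax-based productivity score for a suffix: the number of words with that suffix that occur exactly once, divided by the total number of tokens carrying the suffix. This particular measure is an assumption made for illustration and is not necessarily the one used in the study. The function name, the plain string-ending test for affix membership and the token counts are all hypothetical, and the variable corpus step is not implemented here.

```python
from collections import Counter


def potential_productivity(tokens, has_affix):
    """Hapax-based productivity: hapaxes with the affix / tokens with the affix.

    `has_affix` is a caller-supplied predicate. A real study would rely on
    morphological analysis rather than a simple string test.
    """
    counts = Counter(t for t in tokens if has_affix(t))
    n_tokens = sum(counts.values())
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / n_tokens if n_tokens else 0.0


# Toy token list with invented frequencies, for illustration only.
toy_tokens = (
    ["informazzjoni"] * 3
    + ["edukazzjoni"]
    + ["organizzazzjoni"]
    + ["dar", "kelb"]
)

score = potential_productivity(toy_tokens, lambda w: w.endswith("zzjoni"))
print(f"productivity of -zzjoni in the toy sample: {score:.2f}")
```

Because a score of this kind depends on how many tokens have been sampled, values for different affixes or languages are only meaningful when computed over comparable samples.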



Nizar Habash, New York University, Abu Dhabi

Title: Morphological Processing of Arabic and its Dialects

Date: Friday 22 April 2016 at 9:30 hrs

Venue: GW256

The Arabic language can be quite challenging for automatic processing. Arabic morphology is rich and complex, and its orthography is underspecified, causing a high degree of ambiguity. Arabic dialects, the primarily spoken non-standard varieties of Arabic, pose further challenges. While some aspects of their morphology are simpler than in Standard Arabic, other aspects are more complex. Additionally, they have no standard orthographies and fewer computational resources than Standard Arabic. In this talk, we discuss these challenges and present and demo the state of the art in Arabic and Arabic dialect morphological analysis and disambiguation.

The slides from the presentation can be downloaded here


Marc Tanti, University of Malta 

Title: A Journey in Lexical Semantics: Finding similar, substitutable, simpler words.

Date: Friday 12 Feb 2016 at 12:00 hrs

Venue: GW114


Words have meaning, but can a computer understand this meaning? Using simple techniques, computers can tackle problems which humans require an understanding of word meanings to solve. They can find words which are similar in meaning using distributional semantics; they can find words which are substitutable in a particular context using language modelling; they can even replace difficult words with ones which are simpler using frequency measures. How can computers do these things? In this talk I will explain these concepts in simple terms for a general audience as an introduction to computational lexical semantics.
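
To make two of these ideas concrete, here is a minimal sketch, not taken from the talk, of distributional similarity and frequency-based simplification. It assumes that a word can be represented by counts of the words appearing near it, and that the more frequent of two candidate words is the simpler one. The toy corpus, the frequency counts and the function names are all invented for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Toy corpus, invented for illustration only.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def simplest(candidates, frequencies):
    """Pick the most frequent candidate, treating frequency as a proxy for simplicity."""
    return max(candidates, key=lambda w: frequencies[w])


vectors = cooccurrence_vectors(CORPUS)
print("cat vs dog:", round(cosine(vectors["cat"], vectors["dog"]), 3))
print("cat vs car:", round(cosine(vectors["cat"], vectors["car"]), 3))

# Hypothetical frequency counts, not drawn from any real corpus.
toy_frequencies = Counter({"use": 120, "utilise": 3})
print("simpler word:", simplest(["utilise", "use"], toy_frequencies))
```

In this toy corpus "cat" and "dog" share more contexts than "cat" and "car", so the first pair receives the higher similarity score. Real systems work on the same principle but with far larger corpora and more refined weighting.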


Eric Wehrli, LATL-CUI, University of Geneva 

Title: Collocations and Anaphora Resolution in Machine Translation

Date: Wednesday 9 Dec 2015 at 15:30 hrs

Venue: GW206

Collocation identification and anaphora resolution are widely recognized as major issues for natural language processing, and particularly for machine translation. An abundant literature has been dedicated to each of those issues, but to the best of our knowledge their intersection domain – collocations in which the base term has been pronominalized – has hardly been treated.

In this talk, I will present our (modest) contribution towards filling this gap, focusing on the translation from English to French of collocations of the type verb-direct object (to break a record, to make an appointment, to make a case, to take a break, etc.), with and without pronominalization of the complement.


Tünde Polonyi, University of Debrecen, Hungary

Title: Lexical and grammatical studies of the young and the elderly whilst learning a foreign language 

Date: Friday 30 Oct 2015 at 12:00 hrs

Venue: GW104

Several studies have focused on foreign language learning, most of which have concerned the younger age group. Our research aim was to find out which aspects of a language can be mastered successfully by the elderly at the beginning of the language learning process, compared to younger subjects.

In the present study 25 university students formed the young adult group (17 females, 8 males, mean age 19.32, range 18-24, SD = 2.16). The elderly group was composed of 19 participants (14 females, 5 males, mean age 61, range 56-67, SD = 3.6). All participants were native speakers of Hungarian.

The study was implemented with the help of an artificial language (see Polonyi, 2012). We used digitized cartoon drawings of animals performing different picturable actions in dyadic pairs. The animals and actions could be combined freely to create a large number of different scenes corresponding to independent clauses of the type “The dog hugs the lion”. Participants were familiarized with this new language in a training session. During the training session, images were displayed on the screen accompanied by an appropriate descriptive sentence. Participants observed the novel language and the pictures, and read the sentences below the images aloud three times. The training session was followed by several tasks tapping different aspects of language learning: a picture-word matching task, grammatical learning tasks and interviews. The experiment consisted of three sessions run on three consecutive days.

Our results show that there are certain implicit (unconscious) learning processes, but they may prevail mostly in relation to word learning. Incidental learning is not effective in grammar learning, not even in the case of young adults. Providing examples and concrete explanations is likely to be more beneficial for the learner at the beginning of the learning process. Results also indicate that throwing the elderly in at the deep end does not help, but effective word learning mechanisms are available at an older age as well: some participants achieved outstanding accuracy at word recognition and word recall.

Keywords: foreign language learning, artificial language, lexical learning, grammatical learning, elderly and young adults

Last Updated: 12 October 2016

