Speaker: Paola Merlo (University of Geneva)
Title: Quantitative Computational Syntax: some case studies
Time: Thursday, 3/21, 5-6pm
Location: 46-5165
Abstract: In the computational study of intelligent behaviour, the domain of language is distinguished by the complexity of the representations and the sophistication of the domain theory that is available. It also has a large amount of observational data available for many languages. The main scientific challenge for computational approaches to language is the creation of theories and methods that fruitfully combine large-scale, corpus-based approaches with the linguistic depth of more theoretical methods. I report here on some recent and current work on word order universals and argument structure that exemplifies the quantitative computational syntax approach. First, we demonstrate that typological frequencies of noun phrase orderings, Universal 20, are systematically correlated with abstract syntactic principles at work in structure building and movement. Then, we investigate higher-level structural principles of efficiency and complexity. In a large-scale computational study, we confirm a trend towards minimization of the distance between words, over time and across languages. In the third case study, much like the comparative method in linguistics, cross-lingual corpus investigations take advantage of any corresponding annotation or linguistic knowledge across languages. We show that corpus data and typological data involving the causative alternation exhibit interesting correlations explained by the notion of spontaneity of an event. Finally, time permitting, I will discuss current work investigating whether the notion of similarity in the intervention theory of locality is related to current notions of similarity in word embedding space.
Bio:
Paola Merlo is faculty in the linguistics department of the University of Geneva. She is the head of the interdisciplinary research group Computational Learning and Computational Linguistics (CLCL). The group is concerned with interdisciplinary research combining linguistic modelling with machine learning techniques. The scope of her current research includes fundamental issues in the statistical nature of language, empirical evaluations of linguistic proposals about the lexical semantics of verbs and language universals of word order, and statistical models of syntactic and semantic parsing. Prof. Merlo has been the editor of Computational Linguistics, the journal of the Association for Computational Linguistics, and has been a member of the executive committees of the EACL and of the ACL. Prof. Merlo holds a doctorate in Computational Linguistics from the University of Maryland, USA. She has been an associate research fellow at the Institute for Cognitive Science at the University of Pennsylvania, and has been a visiting scholar at Rutgers, Edinburgh, and Stanford.