ILLC Universiteit van Amsterdam

Statistical Language Processing and Learning Lab.

Part of Language and Computation, ILLC, FNWI, UvA.  
Contact: Dr Khalil Sima'an

Our research concentrates on structured language processing, with applications to morpho-syntactic parsing and hierarchical machine translation. Our work aims at inducing and exploiting the latent structure that captures salient regularities in natural language data (mono- and multilingual corpora) in order to improve language applications. Sample recent topics (see, e.g., the publications or the individual pages of the group members):
  • effective syntax-based models for machine translation, including syntactic language models, syntactic and latent reordering models, and synchronous grammar learning (with H. Hassan (DCU), A. Way (DCU), M. Mylonakis and M. Khalilov),
  • representational and computational issues in parsing morphologically rich languages (mainly Semitic; with Reut Tsarfaty, Yoad Winter (now U. Utrecht), Roy Bar-Haim (Technion) and Alon Itai (Technion)),
  • computational aspects (algorithms and complexity) and statistical learning for Data-Oriented Parsing (with R. Scha, R. Bod, A. Zollmann, L. Buratto and D. Prescher),
  • inducing rich syntactic lexica from a mix of unannotated and annotated data (with T. Deoskar and M. Mylonakis).


Researchers


Khalil Sima'an (ILLC, UvA), Principal Investigator

Markos Mylonakis (UvA, NWO VIDI project), 2007-11, PhD student

Gideon Wenniger (UvA, Open Competitie project), 2010-14, PhD student

Sophie Arnoult, 2011-12, Research assistant


Alumni Ph.D. students

  • Markos Mylonakis (UvA, NWO VIDI project), graduated 19 January 2012; Xerox Research Centre Europe
  • Reut Tsarfaty (UvA, NWO MOSAIEK project), graduated 24 March 2010; Uppsala University
  • Hany Hassan, graduated January 2009 (co-supervised with Andy Way at Dublin City University, Dublin, Ireland); Microsoft Research


Alumni postdocs



Ongoing projects:

Concluded projects:
  • Learning Stochastic Tree-Grammars from Treebanks (LeStoGram), NWO-EW Open Competitie [2003-2006], PI=Khalil Sima'an (approx. 200 kEuro).
    The project was concluded in October 2006.
  • Beyond Treebank Annotations: Ambiguity Resolution by Similarity-Based Performance Models, personal innovation grant (KNAW Fellowship, Royal Netherlands Academy of Arts and Sciences), awarded in 2002, PI=Khalil Sima'an (approx. 200 kEuro).
    The project was concluded in 2003.