I am a post-doc and lecturer at the Institute for Logic, Language & Computation (ILLC),
University of Amsterdam, in the Language and Computation (LaCo) Group.
I recently finished my dissertation at Cornell University, Ithaca, NY.
Email : t DOT deoskar AT uva DOT nl
Tel : +31 (0)20 525 8251
Fax : +31 (0)20 525 5206
Postal Address :
P.O. Box 94242
1090 GE Amsterdam
Science Park 904
1098 XH Amsterdam
Computational Linguistics: statistical parsing, unsupervised grammar induction, richer computational models of syntax
Linguistics, Typology: syntax of SOV languages, syntax and typology of South Asian languages in general (esp. Hindi, Marathi)
I am interested in statistical models of natural language, and in particular in statistical parsing. I am interested in building statistical models of syntax with rich representations, particularly lexical representations, and in estimating accurate statistics for these from annotated and unannotated data. Currently I am experimenting with a modified version of the inside-outside algorithm, together with a treebank-trained PCFG, to learn fine-grained lexical information from large sources of data.
I am also interested in the syntax and typology of languages with
Subject-Object-Verb (SOV) word order, in particular languages spoken in the South Asian subcontinent.
INDUCTION OF FINE-GRAINED LEXICAL
PARAMETERS OF TREEBANK PCFGS WITH
INSIDE-OUTSIDE ESTIMATION AND LEXICAL
Advisor: Mats Rooth
Other committee members: Lillian Lee, John Whitman
Minor: Cognitive Science (Cognitive Science at Cornell)
Building enhanced and fine-grained Treebank-based unlexicalized PCFGs (Cornell)
Semi-supervised learning of fine-grained lexical categories using Inside-Outside (Cornell/ILLC)
Tejaswini Deoskar, Mats Rooth and Khalil Sima'an. 2009. Smoothing PCFG Lexicons. Proceedings of the 11th International Workshop on Parsing Technologies (IWPT), Paris, France. [pdf]
Deoskar, Tejaswini. 2008. Re-estimation of Lexical Parameters for Treebank PCFGs. Proceedings of COLING 2008, Manchester, UK.
Deoskar, Tejaswini and Rooth, Mats. 2008. Induction of Treebank-Aligned
Lexical Resources. Proceedings of the Sixth International Conference on
Language Resources and Evaluation (LREC), Marrakech, Morocco. [pdf]
Deoskar, Tejaswini and Rooth, Mats. 2007. Corpus Induction of Lexicons for
Treebank PCFGs by Inside-Outside Estimation and Frequency
Transformations. Ms. [pdf]
Deoskar, Tejaswini. 2006. Marathi Light Verbs. Proceedings
of the 42nd Annual Meeting of the Chicago Linguistics Society. [pdf]
An empirical study on the phonological adaptation of speakers of Indian English when
exposed to American English. [pdf]
A paper on Serial Verbs in Khoekhoe [pdf]
Semester I (Fall 2009, Sept-Dec):
Elements of Language Processing and Learning (Master of AI, Master of Logic)
Semester II (Spring 2010, Feb-May):
Statistical Structure in Language Processing (Core course, Master of AI, Natural Language Processing track) course website
Natuurlijke Taalverwerking (Bachelor Informatica): a Bachelor-level course on language modelling.
UvA Blackboard website
Semester I (Fall 2008, Sept-Dec):
Language and Speech Processing (co-taught with Khalil Sima'an) course website
Semester II (Spring 2009, Feb-May):
Probabilistic Grammars and Data-Oriented Parsing course website
Taalmodellen: a Bachelor-level course on language modelling. UvA Blackboard website
"Smoothing fine-grained PCFG lexicons" Talk at IWPT 2009, Paris.
"Identifying fine-grained lexical categories: Supertaggers versus Parsers" August 2009, Talk at Microsoft Research, Bangalore.
"Impact of lexical probabilities on adapting a PCFG to a new domain" at The 19th Meeting of Computational Linguistics in The Netherlands (CLIN), Groningen.
"Induction of Treebank-Aligned Lexical Resources" LREC 2008.
"Estimation of lexical probabilities for Treebank PCFGs" May 2008. Talk at Institute for Logic, Language and Computation (ILLC), University of Amsterdam.
"Re-estimation of Lexical Parameters for Treebank PCFGs" COLING 2008.
"EM-based clustering of Local Syntactic Contexts of words" Sept. 2007. Talk at the NLP Seminar, Cornell University.
"Marathi Light Verbs" Talk at Chicago Linguistics Society, CLS 42.
"Disambiguation of Small Clause versus Ditransitive Verbs in a Treebank PCFG" Talk at Department of Linguistics, Cornell University.
Non-academic Work Experience
2001 to 2002: Worked on creating "meta-directories" that integrated information from telecom
devices and telecom databases, using LDAP (Lightweight Directory Access Protocol).
Also worked as a consultant to design and deploy metadirectory products (Meta-Connect).
1997 to 2001: Worked as an embedded systems designer to build a CCD camera controller for
the IUCAA telescope at Giravali (Maharashtra, India) using the Analog Devices (ADSP) DSP microprocessor family. Also interfaced the CCD
camera controller to a Linux network for programmable camera control and image acquisition (very cool, wrote Linux device drivers for this).
1996 to 1997: Network management of Sun/Silicon Graphics/PC Unix/Windows networks.
Cornell NLP Page
IUCAA Observatory, Giravali, India