Raquel Fernandez - teaching

Computational Semantics and Pragmatics

Master of Logic :: Autumn 2011

Lecturer: Raquel Fernández (ILLC, University of Amsterdam)

Timetable: Thursdays 15-17h, in room D1.162 (period 1) and G2.04 (period 2) -- see datanose.nl

Contents and Objectives: The overall objective of the course is to introduce some of the major topics and methodologies in the study of natural language semantics and pragmatics, from an empirical and computational point of view.

Semantics and pragmatics are concerned with the study of natural language meaning and its context of use in written texts and in conversation. The computational counterparts of these disciplines address these issues from an explicitly computational point of view, combining insights from linguistic theory, computational linguistics, and artificial intelligence. The course will introduce some of the fundamental concepts in computational semantics and pragmatics, exposing students to current research in topics such as distributional lexical semantics, textual entailment, generation and resolution of referring expressions, speech acts, and dialogue modelling. Students will also get acquainted with current methodologies and techniques, such as working with annotated and unannotated corpora, and with rule-based and probabilistic methods.

Prerequisites: There are no formal prerequisites. Some knowledge of semantics or pragmatics as well as basic programming skills are useful but not required. Please fill in this student questionnaire if you intend to take the course.

Evaluation: Homework exercises (40%) and a final paper or project (40%). Before submitting the final paper, you will be asked to submit a 2-page abstract (10%) and to give a presentation (10%). Further details and the precise deadlines can be found below.
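As a rough illustration of how the four components above combine, here is a minimal sketch. The weights come from the Evaluation paragraph; the 0-10 grade scale, the function name `final_grade`, and the example grades are assumptions made for illustration only, not part of the official grading procedure.

```python
# Weights as stated in the Evaluation paragraph.
# The 0-10 scale and the example grades below are hypothetical.
WEIGHTS = {
    "homework": 0.40,
    "final_paper": 0.40,
    "abstract": 0.10,
    "presentation": 0.10,
}


def final_grade(grades):
    """Return the weighted average of the four component grades."""
    missing = set(WEIGHTS) - set(grades)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Hypothetical component grades on an assumed 0-10 scale.
example = {"homework": 8.0, "final_paper": 7.0, "abstract": 9.0, "presentation": 6.0}
print(round(final_grade(example), 2))
```

Late-submission penalties for individual homework exercises are not modelled here; they would apply to the homework grade before it enters the weighted sum.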

This webpage will be updated throughout the semester. Please check it regularly. Its contents are subject to change.

Date	Contents	Homework
8 September 2011	Slides. The first part of today's class was dedicated to practical matters and an overview of the topics of the course. See also the Overview Bibliography. In the second part of the class, we discussed some challenges behind Textual Entailment.	HW#1 due 22 Sept. Submit a PDF by email.
15 September 2011	CLASS CANCELLED (away at CID2011). Students are strongly encouraged to attend the Computational Linguistics Seminar/DIP Colloquium on Wednesday 14 Sept at 15h in room B1.25 (not G2.13), where Shalom Lappin will talk about Probabilistic Semantics for Natural Language. See Homework #1.
22 September 2011	CLASS CANCELLED (away at SemDial 2011)	Required reading for next class: Bos & Markert (2005) Recognising Textual Entailment with Logical Inference
27 September 2011	Slides. We discussed Bos & Markert (2005). This gave us the chance to find out about different computational methods to tackle Textual Entailment: shallow methods based on WordNet and deep methods such as automatic theorem proving.	Bonus exercise: see slides 11-13. Send me your experiences/comments by email. Readings: see slide 18.
6 October 2011	Slides. We discussed the following classic paper, which served as an introduction to distributional models of word meaning: Adam Kilgarriff (1997) I don't believe in word senses, Computers and the Humanities, 31:91-113. At the end of the lecture, we started to look into the technicalities of vector space models.	HW#2 due 17 Oct. Submit a PDF by email (one point less per day of delay).
13 October 2011	Slides. We reviewed the parameters of DSMs and saw some examples of how these models can be empirically evaluated. In the second part of the lecture, we discussed the core theoretical assumptions and implications of DSMs. The main reference for this part of the lecture is: A. Lenci (2008) Distributional Semantics in Linguistic and Cognitive Research, in Lenci (ed.), From context to meaning: Distributional models of the lexicon in linguistics and cognitive science, special issue of the Italian Journal of Linguistics, 20(1):1-30.
20 October 2011	Slides (improved from those used in class). The lecture gave an overview of the methodology involved in using machine learning techniques for linguistic analysis and NLP tasks. We distinguished between supervised and unsupervised methods, taking as a case study WSD. We didn't have time to go over several aspects related to unsupervised learning, which we will cover next week.	Required background reading for next class: Grice (1975) and Davis (2010) on Implicature (see bibliography).
27 October 2011	Slides. We recapitulated the material on supervised learning from the previous lecture by going over the particular approach of Bos & Markert (2005) -- see check questions. We then turned to unsupervised learning, with a focus on WSD. We did not start to get into Gricean pragmatics and conversational implicature yet (I assume you did the required background reading). We ended the lecture with some comments on the final projects.	To check whether you understood the material from last week, please try to find answers to these questions before the class. You don't need to send me the answers; we'll discuss this together.
3 November 2011	NO CLASS. Think about ideas for your personal project and schedule an appointment with me to discuss them. See slides 15-17 from 27 Oct.	Send me an email to schedule an appointment. Preferred meeting times: Thu 3/Fri 4 Nov 10-13h.
10 November 2011	From now on, we will meet in room G2.04. Slides. We briefly introduced the main ideas behind Gricean pragmatics (building on the background reading on this topic) and then turned to computational explorations of conversational implicature -- today with regard to the generation of referring expressions. The following seminal paper is the main reference for this lecture: Dale & Reiter (1995) Computational Interpretations of Gricean Maxims in the Generation of Referring Expressions, Cognitive Science, 18:233-266.	HW#3 due 17 Nov by 13:00h (half a point less per hour of delay).
17 November 2011	Slides. More on computational explorations of conversational implicature, in particular indirect answers to polar questions and scalar implicature. We discussed the following recent paper in detail: de Marneffe, Manning & Potts (2010) "Was it good? It was provocative." Learning the meaning of scalar adjectives. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 167-176.	To prepare for the discussion of the paper by de Marneffe et al. (2010), try to find answers to these questions. (No need to submit your answers.)
24 November 2011	Slides. We started to look into dialogue modelling. We reviewed the main notions behind classic speech act theory and then focused on computational approaches to speech act interpretation, which are reviewed in the following paper: Dan Jurafsky (2004) Pragmatics and Computational Linguistics. Handbook of Pragmatics, Oxford: Blackwell.
1 December 2011	Slides. We briefly discussed the main differences between BDI inference-based approaches and Information State Update approaches to dialogue, and then looked into models of interaction management, in particular grounding. There are several references on the slides.	Submit a 2-page abstract of your final paper (deadline: 1 Dec 13:00h).
8 December 2011	Slides. This last lecture focused on computational aspects related to the ISU approach to dialogue management, with emphasis on computational models of grounding. We also briefly looked at dialogue act taxonomies and did a little exercise annotating a fragment of a dialogue from the Switchboard corpus. You can see the gold standard DA annotation for that dialogue here. See also slide 18 for details on inter-annotator agreement that were not given in class.
15 December 2011	Student presentations of final projects. Slots of 17 minutes (12 + 5 minutes for questions). Do NOT prepare more than 10 slides (send them to me beforehand). We will start sharp at 15h -- please don't be late! Feel free to invite fellow students to attend the workshop. See below for more details.	Email me the slides of your presentation by 13:00h on the day of the workshop (in PDF or PPT).

Final projects: During the second part of the course you should work on a personal project related to the topics of the course. Pretty much anything related to computational semantics and pragmatics broadly understood counts as a possible topic. Make sure you choose something you find interesting. Here are a few ideas on possible types of projects (abstracting over particular topics):

a quantitative corpus study of some interesting phenomenon
a machine learning experiment using an existing corpus
an analysis of data collected by yourself in an experiment
an analysis and small extension of a paper from the literature
an analysis of interesting connections between different approaches

Some options in the list above may seem unfeasible to you, but they may be perfectly possible -- don't abandon an interesting idea before discussing it with me! Once you have agreed with me on a topic, these are the next steps:

1 December (by 13:00h): Abstract
Submit a 2-page abstract that summarises your plans for the project. This should include a brief introduction to the topic, a succinct description of your goals, and an outline of the work done so far and the work still to be done. Of course, it should also include a section with any references you cite in the text. You may consider formatting your abstract using the style required for the final paper (see below).
15 December: Presentation
At the end of the course we'll have a little workshop where each of you should present your project topic and your work in progress to the rest of us. Presentations should last around 10-15 minutes. Both the speakers and the audience play important roles in this session. As a speaker, you should make your project understandable to your fellow classmates (who have not read your abstract). Ideally, your presentation should include the following:
1. background and motivation behind your study (why you do it)
2. your main aim (what you plan to do), and
3. a sketch of your approach (how you'll do it).
You may want to have a look at these general tips on how to give a talk by Ulle Endriss.
As part of the audience, you should try to give useful feedback: are there aspects that are not clear? do the approach and the work plan make sense? are they reasonable? can you think of suggestions for improvement?
29 December: Final paper (***papers submitted after the 31st of December will NOT be graded***)
The final output of your project should be a paper of at most 8 pages in total, including references (between 5 and 8 pages is appropriate). The paper should be written in LaTeX using the style files of EACL 2012, which you can find here.