Next: Midterm project Up: Language and Speech Processing Previous: Reading material

Lecture schedule 2008/9

Block A: Sep and Oct 2008 (Tentative Schedule: May Change During Course of Semester)
1. Introduction and Motivation (Why probabilistic models for language and speech processing?)
http://staff.science.uva.nl/~simaan/D-LangAndSpeech0809/D-Lectures/LSP091.pdf
Read chapters 1 and 2 of Manning & Schütze or Jurafsky & Martin. Also read this paper: Empirical validity and technological viability: Probabilistic models of Natural Language Processing.
2. Basic Probability Theory and Statistics
http://staff.science.uva.nl/~simaan/D-LangAndSpeech0809/D-Lectures/LSP092.pdf
Main reading: chapters 1 and 2 from Manning & Schütze
More about statistics: read also chapter 1 of Krenn and Samuelsson http://www.ofai.at/~brigitte.krenn/papers/stat_nlp.ps.gz
More about learning: read chapters 1 and 2 of Machine Learning (T. Mitchell).
 Free-choice homework (no need to hand in): Derive the corollaries on lecture slide 8, the chain rule on slide 9, and the partition rule on slide 10, using only the axioms and set theory. Next, let a ``word" be defined as a sequence of symbols separated by white-space. Take a large English text (for example a collection containing at least 1 million word occurrences from Wikipedia) and extract all words and their counts from a steadily growing portion of the text: start with 10% of the text and add another 10% each time until you count over the full text. Plot the relative frequency estimate (RFE) of the word probability for selected words (e.g. ``the", ``man", ``company", ``browsing" or ``Bush") and observe whether the RFE converges to a stable value as the data grows. Discuss the differences in convergence between words with your fellow students.
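The counting part of the homework above can be sketched as follows (a minimal illustration; the function name and the toy text are mine, and a real run would tokenize at least 1 million word occurrences from e.g. Wikipedia):

```python
from collections import Counter

def rfe_curve(tokens, target, steps=10):
    """Relative frequency estimate of `target` over growing prefixes of `tokens`.

    Returns the RFE computed on the first 10%, 20%, ..., 100% of the data
    (for steps=10), so convergence can be inspected as the sample grows.
    """
    n = len(tokens)
    curve = []
    for k in range(1, steps + 1):
        prefix = tokens[: n * k // steps]      # first k/steps of the text
        counts = Counter(prefix)
        curve.append(counts[target] / len(prefix))
    return curve

# Toy stand-in for a large corpus: whitespace-separated ``words".
text = "the cat sat on the mat and the dog saw the cat " * 100
tokens = text.split()
print(rfe_curve(tokens, "the"))
```

On real text, frequent words like ``the" typically show a quickly flattening curve, while rare words fluctuate much longer; comparing such curves is the point of the exercise.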
3. Hidden Markov Language Models
Word-prediction, sentence probability (without structure), Ngrams and Markov models, POS tagging and Hidden Markov Models.
http://staff.science.uva.nl/~simaan/D-LangAndSpeech0809/D-Lectures/LSP093.pdf
Read chapter 6 of Jurafsky and Martin, or sections 6.1-6.3 and 9.1 from Manning and Schütze.
Read chapter 8 of Jurafsky and Martin about POS tagging in general (you may skip section 8.6).
On HMMs: read sections 9.1, 9.2, 9.3.1 and 9.3.2 from chapter 9 of Manning and Schütze.
Further, on evaluation of taggers: read section 10.6 of Manning and Schütze.
More on tagging: http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=972477
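The sentence-probability idea behind the lecture (a first-order Markov, i.e. bigram, model estimated by relative frequencies) can be sketched as follows. This is a minimal illustration with a toy corpus of my own choosing, not code from the course:

```python
from collections import Counter

def train_bigram(sentences):
    """MLE bigram model with <s>/</s> boundary markers.

    Returns p(v, w) = count(v, w) / count(v), the relative-frequency
    estimate of P(w | v).
    """
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(toks[:-1])             # contexts: everything but </s>
        bigrams.update(zip(toks, toks[1:]))
    return lambda v, w: bigrams[(v, w)] / unigrams[v] if unigrams[v] else 0.0

def sentence_prob(p, sentence):
    """P(sentence) as the product of bigram transition probabilities."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for v, w in zip(toks, toks[1:]):
        prob *= p(v, w)
    return prob

p = train_bigram(["the dog barks", "the cat meows", "the dog sleeps"])
print(sentence_prob(p, "the dog barks"))   # 1 * 2/3 * 1/2 * 1 = 1/3
```

Note the weakness this exposes: any sentence containing an unseen bigram gets probability zero, which is exactly what the later lecture on smoothing addresses.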
4. HMM implementation as SFST; Tagging Algorithms; Forward/Backward Algorithms
Same slide file as preceding lecture (i.e. http://staff.science.uva.nl/~simaan/D-LangAndSpeech0809/D-Lectures/LSP093.pdf).
This lecture extends the preceding one; see that lecture for reading details.
Read also chapter 10 of Manning and Schütze, and on spelling correction read Jurafsky and Martin chapter 5 (up to section 5.6) and chapter 6.
5. Dealing with Unseen Events: Methods for Smoothing Maximum-Likelihood Statistics
http://staff.science.uva.nl/~simaan/D-LangAndSpeech0809/D-Lectures/LSP095.pdf
Read chapter 6 of Manning and Schütze (or sections 6.1-6.6 of Jurafsky and Martin), and then read up to page 18 of Stanley Chen and Joshua Goodman, "An empirical study of smoothing techniques for language modeling", Technical Report TR-10-98, Harvard University, August 1998.
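The simplest smoothing method from the readings, add-one (Laplace) smoothing, can be sketched as follows. This is a minimal unigram illustration with my own function name and toy data; the readings cover the better-performing methods (Good-Turing, interpolation, back-off):

```python
from collections import Counter

def laplace_unigram(tokens, vocab_size=None):
    """Add-one smoothed unigram estimates: P(w) = (c(w) + 1) / (N + V).

    Every word, seen or unseen, receives non-zero probability. If
    vocab_size is not given, V defaults to the number of observed types,
    so an out-of-vocabulary word then adds mass beyond 1; a real model
    fixes V (including unseen words) up front.
    """
    counts = Counter(tokens)
    V = vocab_size if vocab_size is not None else len(counts)
    N = len(tokens)
    return lambda w: (counts[w] + 1) / (N + V)

p = laplace_unigram("a b a c".split())     # counts: a:2, b:1, c:1; N=4, V=3
print(p("a"), p("d"))                      # 3/7 for seen ``a", 1/7 for unseen ``d"
```

Contrast this with the unsmoothed maximum-likelihood estimate, which would assign the unseen word probability zero and thereby zero out any product it appears in.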
6. Basic Information Theory, Learning and Estimation
See also chapters 1 and 2 of Manning & Schütze; read also chapter 1 of Krenn and Samuelsson.
7. First Parsing lecture
Block B: Nov and Dec 2008 (Tentative Schedule: May Change During Course of Semester)
1. Lecture
2. Lecture
3. Lecture
4. Lecture
5. Lecture
6. Lecture
7. Lecture

Khalil Sima'an 2008-10-02