Foundations of Statistical Natural Language Processing


This course introduces some of the central themes and techniques that have emerged in statistical methods for language technologies and natural language processing. While many early NLP systems relied heavily on hand-crafted rules, a great deal of progress has been made during the past ten years using probabilistic methods that learn about language automatically and implicitly by extracting statistics from large quantities of text, thus reducing the knowledge acquisition bottleneck. As computational power increases and more natural language data becomes available online, statistical methods will become increasingly attractive and powerful.

The seminar provides detailed coverage of current techniques, their strengths and limitations, and current research directions, drawing on recent research papers. Over the course of the seminar, students will acquire key skills such as the fundamentals of academic research and scientific writing, and they will be encouraged to improve their presentation skills.

Core Literature

Manning, C. D. & Schütze, H. (1999). Foundations of Statistical Natural Language Processing, The MIT Press, Cambridge, Massachusetts.

For each chapter, current research papers will be discussed in class.

Course Preparation

Every student is advised to read the introductory chapters (ch. 1 – ch. 4) of Foundations of Statistical Natural Language Processing before the first seminar session. These chapters are expected to have been worked through thoroughly by the fourth session at the latest.


Each student is expected to

  • give a 30-minute talk in class, followed by 15 minutes of Q&A
  • write a term paper
  • show active participation in class


If you plan to participate in this seminar, please register here.


The seminar is cancelled due to a lack of attendance. It will, however, be offered again in the future.

The seminar takes place Tuesdays, 15:20 – 17:00, S2|02 E202.

  • Admin + topic assignment: 19.10.2010
  • Introductory session: 26.10.2010
  • Q&A session about presentation topics: 02.11.2010
  • 1st presentation: 09.11.2010
  • Office hours: Tuesdays, 13:00 – 14:00, S2|02 D216. Please send an email in advance!

Note: On 17.11. and 07.12. the seminar will be in room A126.

Materials and Forum

The course management system is used as the primary communication platform for the seminar and also contains any related material. The access key will be provided in the first seminar session.

For general advice on presenting your topic, please have a look at these guidelines.


  • Prof. Dr. Iryna Gurevych
  • Oliver Ferschke, M.A.