Text Analytics: Statistical Methods in NLP


This course introduces some of the central themes and techniques that have emerged in statistical methods for language technologies and natural language processing (NLP). While many early NLP systems relied heavily on hand-crafted rules, the past ten years have seen a great deal of progress from probabilistic methods that learn about language automatically and implicitly by extracting statistics from large quantities of text, thus reducing the knowledge acquisition bottleneck. As computational power increases and more natural language data becomes available online, statistical methods will become increasingly attractive and powerful.

The seminar provides detailed coverage of current techniques, their strengths and limitations, and current research directions, drawing on recent research papers. In the course of the seminar, students will acquire key skills such as the fundamentals of academic research and scientific writing, and they will be encouraged to improve their presentation skills.


Manning, C. D. & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge, Massachusetts.

For each chapter, current research papers will be discussed in class.

Prerequisites and Preparation

Every student is advised to read the introductory chapters (ch. 1 – ch. 4) of Foundations of Statistical Natural Language Processing prior to the first seminar session. Students are expected to have worked through these chapters thoroughly by the fourth session at the latest.


Each student is expected to

  • give a 30-minute talk in class, followed by 15 minutes of Q&A
  • write a term paper
  • show active participation in class

Materials and Forum

The course management system is used as the primary communication platform for the seminar and also contains any related material. The access key will be provided in the first seminar session.

For general advice on presenting your topic, please have a look at these guidelines.


The seminar takes place Tuesdays, 14:30 – 16:00, S2|02 A126.

  • Admin + topic assignment: 12.04.2011
  • Introductory session: 19.04.2011
  • Invited talk by Prof. Dr. Uwe Quasthoff: 26.04.2011
  • 1st presentation: 03.05.2011
  • Office hours: Tuesdays, 13:30 – 14:30, S2|02 B111.
    Please send an email in advance!


  • Oliver Ferschke, M.A.
  • Prof. Dr. Iryna Gurevych