Natural Language Processing and the Web

Teaching Staff

  • Prof. Dr. Iryna Gurevych
  • Dr. Thomas Arnold
  • Gisela Vallejo

We currently do not have fixed office hours, so please contact us by email to arrange an appointment.


  • Lecture: Tuesday 09:50-11:30, Room S202/C120, starting October 22
  • Practice class: Thursday 16:15-17:55, Room S202/C120, starting October 24

The learning material is available on the Moodle eLearning platform.


If you plan to participate in this course, please register on Tucan.


To pass, each student has to take the written exam at the end of the semester.

There will also be a project in the practice class which will contribute to your overall grade.


  • Date/Time: March 3rd 2020, 15:00 – 17:00
  • Room: S101/A4 + S208/171

Course content

Search engines, spelling correction, automatic question answering, translation: the Web is both an application area and a valuable resource for many useful, everyday applications. This lecture presents Natural Language Processing (NLP) methods for automatically exploring the World Wide Web, performing Web mining, and gaining insight into open research problems. In the practice sessions, we introduce state-of-the-art NLP toolkits and work on functional NLP projects.

Processing of unstructured web content

  • Introduction
  • NLP Basics – Tokenization, Part-of-Speech Tagging, Chunking, Stemming, Lemmatization, Semantic and Syntactic Analysis
  • Web contents and their characteristics, Web Genre Identification

NLP applications for the web

  • Information retrieval – introduction to the basics of information retrieval
  • Web information retrieval – natural language interfaces for web information retrieval
  • Crowdsourcing
  • Argument Mining
  • Question answering (QA): Factoid QA, Knowledge Base QA, Community QA


  • Daniel Jurafsky, James H. Martin, Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 3rd edition, 2019
  • Jacob Eisenstein, Introduction to Natural Language Processing, 2018
  • W. Bruce Croft, Donald Metzler, Trevor Strohman, Search Engines. Information Retrieval in Practice, 2015