Text Analytics

Text Analytics: NLP for Document Processing

This seminar covers recent research on natural language processing (NLP) methods for document processing.


All information will be distributed via the Moodle eLearning platform.

The first sessions will consist of introductory lectures covering the basics of the machine learning methods used for NLP tasks. The program for the remainder of the seminar will be determined by the number of participants.

Teaching Staff

  • Soumya Sarkar
  • Prof. Dr. Iryna Gurevych

Course content

Natural language processing (NLP) has made considerable progress in the last decade, especially with advances in deep neural networks. Many of these technologies were developed in academic research labs and have since been applied in real-world applications, powering tasks such as search, recommendation, and autosuggestion. Another promising subfield of NLP is document processing (e.g. of scholarly articles or Wikipedia articles). In this seminar, we will explore the literature in this domain and look into the following questions:

What are the drawbacks of current NLP technologies when applied to long documents?

What are the document-level problems that can be solved with these approaches?

How can we read academic literature critically and provide useful feedback on unpublished work?


Will be announced during the seminar.


When you should send me a request for an office hour: 2 weeks before your presentation (if you present in the first week, you can send it 1 week before).

What you should tell me in your e-mail: (1) your preferred half-hour time slot, if you have a preference; (2) your name and your paper.

When you should send me your presentation draft: as early as possible, and no later than 3 days before our meeting.