Over a period of five years, the InterText research project will develop AI methods that enable the processing and analysis of texts and their relationships to each other. Such relationships can be, for example, contradictions, implicit references or comments. In the age of information overload, this new technology is intended to provide users with a simple summary of complex information on a specific topic, for example to help them check for misinformation.
Natural Language Processing
Natural Language Processing (NLP) is a field of AI research that deals with language and text. The field started with rule-based approaches such as the context-free grammars introduced by Noam Chomsky. In the modern era, Deep Learning has revolutionised NLP, and neural networks and machine learning are now the primary means of solving problems in the field. New network architectures in conjunction with greater computational power have resulted in previously unimagined progress in performance. This is readily visible in applications such as machine translation or question answering.
Navigating the jungle of information
The Internet offers a large amount of often contradictory information on almost every topic. This information can be found mostly in text sources. In order to get a comprehensive picture of a complex topic, users often have to cross-reference different sources, and there may be a number of relationships between such texts. For instance, several texts might convey the same basic message. However, it is equally possible that the information in two texts cannot be reconciled, or that one text even explicitly contradicts the other.
While current NLP systems are already good at answering simple factual questions, they fail to identify such complex references. One of the reasons for this is that current NLP research is mainly concerned with processing and analysing standalone, short texts and ignores the relationships between these texts.
Professor Gurevych's InterText project aims to close this gap. The work focuses on three widespread phenomena in which texts are connected in different ways: First, the relationship between a document and the comments in that document, as we can see in word processing tools or PDF documents. Second, the relationship between two longer texts, such as a blog entry about a newspaper article or a peer review of a scientific paper. And lastly, the relationships between different versions of the same text.
To achieve this, one of InterText’s goals is to create new datasets and develop methods based on the Deep Learning model known as the Transformer. The new methods can consider not just the actual texts, but also their structure, for example, the titles, headings and sections. These new methods will be applied in two studies with real users. The studies will provide the research team with valuable feedback that can be used to optimise the systems.
About Prof. Iryna Gurevych
Iryna Gurevych is the first-ever LOEWE Distinguished Chair in the state of Hesse, Vice President of the Association for Computational Linguistics (ACL), a founding member of the Hessian Center for Artificial Intelligence (hessian.AI), as well as a W3 professor at the Department of Computer Science and founder and director of the Ubiquitous Knowledge Processing Lab at TU Darmstadt. She received her doctorate in computational linguistics from the University of Duisburg-Essen in 2001 and worked as a postdoctoral researcher at the European Media Lab in Heidelberg. In 2006, she began working as a group leader and assistant professor at TU Darmstadt, becoming a W3 professor in 2009. Her work has earned her an Emmy Noether Junior Research Group and a Lichtenberg Professorship. Iryna Gurevych is an ACL 2020 Fellow (< 0.2 percent of the scientific community) and an ELLIS (the European Laboratory for Learning and Intelligent Systems) Fellow.
About ERC Advanced Grants
ERC Advanced Grants are awarded by the European Research Council to researchers from all disciplines. The target group for these grants is researchers with an outstanding scientific track record in their field. In the current round, 1735 applications were submitted and 253 grants were awarded. In addition to Professor Gurevych, Professor Ahmad-Reza Sadeghi, also from TU Darmstadt, was awarded an ERC Advanced Grant.