The UKP lab is active in many research projects funded by various agencies such as the German Research Foundation, the Federal Ministry of Education and Research, the Hessian Ministry of Higher Education, Research, Science and the Arts, and the European Union.
The DFG Research Training Group GRK 1994 AIPHES (“Adaptive Information Preparation from Heterogeneous Sources”) develops new methods to deal with information overload by condensing multiple documents into a single summary. We develop adaptive methods to create summaries of any type from multiple sources and across different genres. To do so, we combine different methodological backgrounds – computational linguistics, computer science, machine learning – to approach the task of extracting, summarizing and evaluating textual information from different sources.
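The extractive flavor of multi-document summarization can be sketched in a few lines: sentences drawn from several documents are scored by the frequency of their content words, and the top-scoring sentences form the condensed summary. The frequency heuristic below is purely illustrative; the project itself develops adaptive, learned methods.

```python
# Minimal sketch of frequency-based extractive multi-document summarization.
# All names and the stopword list are illustrative assumptions.
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "and", "to"}

def summarize(documents: list[str], n_sentences: int = 2) -> list[str]:
    # Split every document into sentences.
    sentences = [s.strip() for doc in documents
                 for s in doc.split(".") if s.strip()]
    # Score each content word by its frequency across all documents.
    freq = Counter(w for s in sentences
                   for w in s.lower().split() if w not in STOPWORDS)

    def score(sentence: str) -> float:
        words = [w for w in sentence.lower().split() if w not in STOPWORDS]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Keep the highest-scoring sentences as the summary.
    return sorted(sentences, key=score, reverse=True)[:n_sentences]

docs = [
    "Heavy rain flooded the city. Trains were delayed.",
    "The city reported severe flooding after heavy rain.",
]
summary = summarize(docs, n_sentences=1)
```

Redundant sentences from different sources reinforce each other's word frequencies, which is why such a heuristic tends to pick the most widely repeated content.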
In order to make informed decisions, appropriate arguments are needed. However, the sheer amount of information and the complexity of many questions frequently prevent us from finding all arguments relevant to a reasonable decision. Within the project “Decision support by means of automatically extracting natural language arguments from big data” (ArgumenText), the UKP Lab develops novel Argument Mining methods for extracting arguments from large and heterogeneous text sources in order to facilitate decision-making processes. In response to a user-defined search query, neural networks identify relevant arguments in real time and summarize them in a comprehensive way. In contrast to conventional systems, an argumentative information system can show the reasons for or against a decision.
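The overall shape of such an argument-search pipeline can be sketched as follows: candidate sentences are retrieved for a query, filtered by topical relevance, and labeled with a stance (pro or con). The lexicon-based stance step here is a stand-in assumption for the neural classifiers the system actually uses, and all function names are illustrative.

```python
# Hedged sketch of query-driven argument retrieval with stance labeling.
# The cue-word lexicons and overlap scoring are toy stand-ins for learned models.
PRO_CUES = {"benefit", "improve", "support", "advantage"}
CON_CUES = {"risk", "harm", "oppose", "danger", "drawback"}

def relevance(query: str, sentence: str) -> float:
    """Word-overlap relevance score between query and sentence."""
    q, s = set(query.lower().split()), set(sentence.lower().split())
    return len(q & s) / len(q) if q else 0.0

def stance(sentence: str) -> str:
    """Toy stance labeler: counts pro/con cue words."""
    words = set(sentence.lower().split())
    pro, con = len(words & PRO_CUES), len(words & CON_CUES)
    if pro > con:
        return "pro"
    if con > pro:
        return "con"
    return "none"

def find_arguments(query: str, sentences: list[str], threshold: float = 0.3):
    """Return (sentence, stance) pairs that are relevant and argumentative."""
    hits = [(s, stance(s)) for s in sentences
            if relevance(query, s) >= threshold]
    return [(s, lbl) for s, lbl in hits if lbl != "none"]

corpus = [
    "Nuclear energy can improve energy security.",
    "Nuclear energy carries the risk of severe accidents.",
    "The weather was pleasant yesterday.",
]
results = find_arguments("nuclear energy", corpus)
```

Grouping the results by stance label is what lets an argumentative information system present reasons for and against a decision side by side.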
The use of AI is becoming more common in security applications. However, the security of the applied algorithms is often limited: trained models are, for example, vulnerable to targeted attacks and risk leaking private information. The research area SenPAI in ATHENE addresses the security of AI with respect to algorithms and systems, as well as ML-based applications in the field of cybersecurity.
The CDR-CAT project develops methods for the detection and automatic assessment of the dimensions of Corporate Digital Responsibility (CDR) for providers of digital products or services (especially SMEs). CDR extends the principles of corporate social responsibility (CSR) to include assessment criteria and recommendations for action that respond explicitly to advancing digitization and the associated increased requirements for responsible data handling.
CEDIFOR (Centre for the Digital Foundation of Research in the Humanities, Social, and Educational Sciences) is a Digital Humanities Centre, established in 2014. We intend to contribute to bridging the gap between research in the Humanities and computer-based methods, and to help researchers master the characteristic problems in this process. We provide methodological expertise for advising researchers from the Humanities, Social, and Educational Sciences on adopting computer-based methods in their research.
By 2050, roughly two-thirds of the world population is expected to live in urban areas. Sustainable growth in the number and size of cities is only possible through efficiency gains in (critical) infrastructures such as energy, transportation, logistics, and water.
Dictionaries are an essential resource in many domains of research, education, and natural language processing (NLP). Example sentences, which illustrate real-world uses of a lemma, are a crucial part of any dictionary. However, finding good example sentences in large corpora imposes a heavy workload on lexicographers. In this project, we develop a novel system that eases the work of lexicographers by interactively assessing the goodness and diversity of dictionary examples.
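The notion of example "goodness" can be illustrated with a few heuristic criteria in the spirit of GDEX-style example scoring: the lemma should occur in the sentence, the sentence should be of moderate length, and its vocabulary should be accessible. The criteria and weights below are illustrative assumptions, not the project's actual model.

```python
# Toy GDEX-style scorer for ranking dictionary example candidates.
# Weights and thresholds are illustrative assumptions.
def example_score(sentence: str, lemma: str,
                  min_len: int = 5, max_len: int = 20) -> float:
    tokens = sentence.lower().rstrip(".!?").split()
    score = 0.0
    # 1. The lemma must actually occur in the example.
    if lemma.lower() in tokens:
        score += 0.5
    # 2. Prefer sentences of moderate length.
    if min_len <= len(tokens) <= max_len:
        score += 0.3
    # 3. Penalize complex vocabulary (proxy: very long words).
    if all(len(t) <= 12 for t in tokens):
        score += 0.2
    return score

candidates = [
    "The bank approved the loan after a short review.",
    "Bank.",
    "Incomprehensibilities notwithstanding, the quasi-derivative instruments matured.",
]
ranked = sorted(candidates, key=lambda s: example_score(s, "bank"),
                reverse=True)
```

An interactive system could let the lexicographer adjust such criteria and re-rank candidates on the fly, which is where the "interactive assessment" described above comes in.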
Diagnostic reasoning is a key competence in many professions. The interdisciplinary FAMULUS project aims to study how online case simulations that provide automatic adaptive feedback can foster students' diagnostic skills.
Towards an Infrastructure for the Distributed Exploration and Annotation of Large Corpora and Knowledge Bases
The DIP project – an international cooperation with Bar-Ilan University and the Israel Institute of Technology – aims at the next big step in information access technology. The goal is to support users in identifying and assimilating the large set of relevant statements spread across the multitudes of documents typically retrieved by current search technologies. Novel methods for statement extraction, information consolidation, and relation inference form the core research areas of this project.
This project aims at investigating computational methods that continuously improve their capability to recognize arguments in ongoing debates, align incomplete arguments with previous arguments and enrich them with automatically acquired background knowledge, and constantly extend semantic knowledge bases with information required to understand arguments.
Peer review is at the core of modern academic quality control. Reviewing scientific manuscripts requires effort and expertise, and growing publication rates across research fields make traditional essay-based modes of peer review hard to sustain. The PEER project investigates document-centered and machine-assisted alternatives to traditional peer review.
In the age of life-long learning, the amount of educational data provided by expert services and community-based question-and-answer (QA) pages on the Web is growing fast. Although these pages provide useful information, benefiting from them is not always easy: users need to go through various educational information services and query each of them individually, which entails considerable effort. Automatically analyzing such information on the Web will help users access the required pieces of information with minimal effort.
The number of published scientific articles has grown exponentially in the last few decades. As a result, it is impossible for researchers to benefit from all published articles, and it has become increasingly difficult to explore all information relevant to their research. NLP is a key technology to aid researchers in dealing with this information overload. In this project, we investigate several foundational technologies for Question Answering (QA) for scientific information over different types of data, such as tables and texts.
In this research project (Sci-Arg) we will focus on developing Argument Mining (AM) techniques from an information-seeking perspective to detect relevant arguments in large-scale scientific literature.
Large scale health-related crises such as the ongoing Covid-19 pandemic spread fear and uncertainties across society, providing fertile soil for fake news and conspiracy theories. The current "infodemic" shows how social networks further amplify such misinformation. Changing policies, new scientific discoveries, and constantly evolving misinformation and conspiracy theories pose a severe problem to manual fact-checking.
Automatic question answering (QA) facilitates the extraction and identification of relevant knowledge in large data sources that would otherwise be hard for humans to find. Furthermore, numerous other natural language processing (NLP) tasks can be formulated within the scope of a QA framework, which positions QA as one of the most prominent NLP tasks. Due to the rapid progress in the field, researchers are confronted with situations where state-of-the-art models are outdated just a few months after publication. In this project, we aim to provide researchers with an extensible QA platform to explore, compare, and combine state-of-the-art QA approaches, and to aid the development of novel approaches by standardizing access to available data and model sources.
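One way to picture "standardized access" is a common interface behind which retrievers and readers become interchangeable, so different approaches can be explored, compared, and combined. The classes and the toy overlap-based scoring below are assumptions for this sketch, not the platform's actual API.

```python
# Illustrative sketch of a QA pipeline with pluggable components behind
# common interfaces. Component names and scoring are illustrative assumptions.
from abc import ABC, abstractmethod

class Retriever(ABC):
    @abstractmethod
    def retrieve(self, question: str, k: int) -> list[str]: ...

class OverlapRetriever(Retriever):
    """Ranks documents by word overlap with the question."""
    def __init__(self, docs: list[str]):
        self.docs = docs

    def retrieve(self, question: str, k: int = 1) -> list[str]:
        q = set(question.lower().split())
        return sorted(self.docs,
                      key=lambda d: len(q & set(d.lower().split())),
                      reverse=True)[:k]

class Reader(ABC):
    @abstractmethod
    def answer(self, question: str, context: str) -> str: ...

class BestSentenceReader(Reader):
    """Placeholder reader: returns the best-overlapping sentence."""
    def answer(self, question: str, context: str) -> str:
        q = set(question.lower().split())
        sents = [s.strip() for s in context.split(".") if s.strip()]
        return max(sents, key=lambda s: len(q & set(s.lower().split())))

def qa_pipeline(question: str, retriever: Retriever, reader: Reader) -> str:
    # Retrieve the most relevant document, then extract an answer from it.
    context = retriever.retrieve(question, k=1)[0]
    return reader.answer(question, context)

docs = [
    "The UKP Lab is located in Darmstadt. It studies natural language processing.",
    "Peer review is the core of academic quality control.",
]
answer = qa_pipeline("Where is the UKP Lab located?",
                     OverlapRetriever(docs), BestSentenceReader())
```

Because both stages are typed against abstract interfaces, a neural retriever or an extractive transformer reader could be dropped in without changing `qa_pipeline` at all.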
Production processes managed via a shop-floor management system require considerable effort to document properly during manufacturing and may profit substantially from AI-driven solutions. This project investigates methods to automatically extract and utilize relevant information from unstructured texts, such as chat logs, in order to streamline error-handling workflows in a factory environment.
We bring current hot topics, such as fake news, digitalization, and content analytics, into schools. With our original, school-oriented computer science workshop, we address 6th- and 7th-grade students (aged 11-12) and give them an introduction to content analytics and programming in a creative and playful setting.