The UKP lab is active in many research projects funded by various agencies such as the German Research Foundation, the Federal Ministry of Education and Research, the Hessian Ministry of Higher Education, Research, Science and the Arts, and the European Union.
In the realm of software development, code generation systems have emerged as a game-changing technology that aims to revolutionize how we write software. Developers worldwide leverage code generation systems such as Copilot to accelerate coding.
The use of AI is becoming more common in security applications, yet the security of the applied algorithms themselves is often limited: trained models are, for example, vulnerable to targeted attacks and pose risks of privacy loss. The research area SenPAI in ATHENE addresses the security of AI algorithms and systems, as well as ML-based applications in the field of cybersecurity.
Large scale health-related crises such as the ongoing Covid-19 pandemic spread fear and uncertainties across society, providing fertile soil for fake news and conspiracy theories. The current "infodemic" shows how social networks further amplify such misinformation. Changing policies, new scientific discoveries, and constantly evolving misinformation and conspiracy theories pose a severe problem to manual fact-checking.
Peer review lies at the center of academic quality control. Yet, in many fields of science, researchers pre-publish their article drafts on preprint servers such as arXiv or bioRxiv to disseminate their research findings and bypass the typically lengthy reviewing process. However, the resulting vast body of gray literature, i.e., research articles of unverified quality, poses a great challenge to day-to-day research work by putting the burden of quality assessment on the reader. This issue becomes particularly pressing in crisis situations such as the COVID-19 pandemic, where the societal consequences of flawed scientific work can be tremendous and the number of published preprint articles within a field may explode.
The CDR-CAT project develops methods for the detection and automatic assessment of the dimensions of Corporate Digital Responsibility (CDR) for providers of digital products or services (especially SMEs). CDR extends the principles of corporate social responsibility (CSR) to include assessment criteria and recommendations for action that respond explicitly to advancing digitization and the associated increased requirements for responsible data handling.
CEDIFOR (Centre for the Digital Foundation of Research in the Humanities, Social, and Educational Sciences) is a Digital Humanities Centre established in 2014. We aim to help bridge the gap between research in the Humanities and computer-based methods, and to help researchers master the characteristic problems that arise in this process. We provide methodological expertise for advising researchers from the Humanities, Social, and Educational Sciences on adopting computer-based methods in their research.
By 2050, roughly two-thirds of the world population is expected to live in urban areas. Sustainable growth in the number and size of cities is only possible through efficiency gains in (critical) infrastructures such as energy, transportation, logistics, and water.
Natural language processing (NLP) currently fails to support the analysis of fine-grained relationships between texts – intertextual relationships. Modeling such relationships is a crucial milestone for AI, as it would allow analysing the origin and evolution of texts and ideas, and enable new applications of AI to text-based collaboration, from education to business. Funded by the European Research Council, the InterText project is developing the first-ever framework for exploring intertextuality in NLP. InterText will develop conceptual and applied models and datasets for the study of inline commentary, implicit linking, and document versioning. The models will be evaluated in two case studies: academic peer review and conspiracy theory debunking.
The revolutionary opportunities opened up by eXtended Reality (XR) technologies will only materialize if concepts, techniques, and tools are provided to ensure the social acceptance of XR systems. By this we mean that an XR system should not just be innovative and functionally complex, but should also provide an experience that satisfies the goals and needs of the user, complies with the social context in which the system is used, and is transparent, safe, secure, explainable, and trusted by the user.
Dictionaries are an essential resource in many domains of research, education, and natural language processing (NLP). One crucial component of dictionaries is example sentences, which illustrate real-world use cases of a lemma. However, finding good example sentences in large corpora imposes a heavy workload on lexicographers. In this project, we develop a novel system that eases the work of lexicographers by interactively assessing the quality and diversity of dictionary examples.
Towards an Infrastructure for the Distributed Exploration and Annotation of Large Corpora and Knowledge Bases
The formation of opinion on the measures to combat the COVID-19 pandemic resembles that in previous crises (e.g., the "refugee crisis" of 2015): under the impression of an impending crisis, politicians, mass media, and citizens quickly reach a consensus on the measures to be taken. However, this consensus increasingly dissolves as the crisis progresses, which polarizes society and makes it much more difficult for those responsible to solve the problem. The aim of this project is to investigate the opinions of these actors on the measures to combat the COVID-19 pandemic in an interdisciplinary collaboration with communication scholars at JGU Mainz.
Peer review is the core of modern academic quality control. Reviewing scientific manuscripts requires effort and expertise, and growing publication rates across research fields make traditional essay-based modes of peer reviewing hard to sustain. The PEER project investigates document-centered and machine-assisted alternatives to traditional peer review.
The number of published scientific articles has grown exponentially over the last few decades. As a result, it is impossible for researchers to keep up with all published articles, and it has become increasingly difficult to find all information relevant to their research. NLP is a key technology for helping researchers deal with this information overload. In this project, we investigate several foundational technologies for question answering (QA) over scientific information in different types of data, such as tables and text.
Automatic question answering (QA) facilitates the extraction and identification of relevant knowledge in large data sources that would otherwise be hard for humans to find. Furthermore, numerous other natural language processing (NLP) tasks can be formulated as QA, which positions QA as one of the most prominent NLP tasks. Due to the rapid progress in the field, researchers are confronted with situations where state-of-the-art models are outdated just a few months after publication. In this project, we aim to provide researchers with an extensible QA platform for exploring, comparing, and combining state-of-the-art QA approaches, and to aid the development of novel approaches by standardizing access to available data and model sources.
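To illustrate the idea of casting other NLP tasks as QA, the following sketch (purely illustrative; the task wording, the `QAInstance` structure, and the `sentiment_as_qa` helper are our own assumptions, not part of the project's platform) rewrites a sentiment-classification input as a multiple-choice QA instance:

```python
from dataclasses import dataclass, field

@dataclass
class QAInstance:
    """A generic multiple-choice QA instance: question, context, answer options."""
    question: str
    context: str
    options: list = field(default_factory=list)

def sentiment_as_qa(text: str) -> QAInstance:
    # Reformulate sentiment classification as QA: the input text becomes
    # the context, and the class labels become the answer options.
    return QAInstance(
        question="What is the sentiment of this review?",
        context=text,
        options=["positive", "negative", "neutral"],
    )

instance = sentiment_as_qa("Great phone, but the battery dies quickly.")
```

Any QA model that scores answer options against a question and context can then handle the task without task-specific architecture changes, which is what makes the QA framing attractive as a unifying interface.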
Production processes managed via a shop-floor management system require substantial effort to document properly during manufacturing and may profit substantially from AI-driven solutions. This project investigates methods to automatically extract and utilize relevant information from unstructured texts, such as chat logs, to streamline error-handling workflows in a factory environment.
This project aims to privatize texts with formal guarantees, using differential privacy (DP), while simultaneously preserving their utility for research. This would allow the scientific community to analyze texts in any form without breaching the privacy of their creators.
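The formal guarantee behind DP can be sketched with its classic building block, the Laplace mechanism. This is a minimal illustration of the general principle only, not the project's method for text privatization (the function names are our own):

```python
import random

def laplace_noise(scale: float) -> float:
    # The difference of two independent Exp(1) draws follows a
    # Laplace(0, 1) distribution; rescale it to the desired scale.
    return scale * (random.expovariate(1.0) - random.expovariate(1.0))

def dp_count(true_count: float, epsilon: float, sensitivity: float = 1.0) -> float:
    """Laplace mechanism: release a counting query with epsilon-DP.

    Adding or removing one individual's record changes the count by at
    most `sensitivity`, so noise drawn with scale sensitivity/epsilon
    guarantees epsilon-differential privacy for the released value.
    """
    return true_count + laplace_noise(sensitivity / epsilon)
```

The same accounting of "sensitivity" and privacy budget epsilon underlies DP methods for text, where the perturbation is applied to richer representations than a single count.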