SeDiTraH: Fake News and Conspiracy Theories
(Funding Period: 2020 - 2024)

Motivation

Large-scale health-related crises, such as the ongoing Covid-19 pandemic, spread fear and uncertainty across society, providing fertile soil for fake news and conspiracy theories. The current “infodemic” shows how social networks further amplify such misinformation. Changing policies, new scientific discoveries, and constantly evolving misinformation and conspiracy theories pose a severe problem for manual fact-checking.

Current automatic fact-checking systems lack the ability for time-aware claim validation of such complex claims.

Goals

  • Automatic decomposition of claims into easily verifiable sub-claims.
  • Transparent model decisions by outputting textual rationales for a verdict.
  • A fact-checking system, supported by fine-grained annotations, that automatically detects cherry-picked evidence and predicts veracity labels for claims.

Dataset Creation

We create a high-quality dataset with real-world Covid-19 claims by using existing methods to identify check-worthy claims from social media. To identify evidence useful to verify or debunk these claims, we further utilize existing methodologies to find already fact-checked counterparts for these claims.

Method

Current fact-checkers lack the ability for time-aware claim verification, which is especially important in times of crisis, as the veracity of each claim may change quickly. Further, most approaches to automatic fact-checking verify claims by identifying whether a claim is backed up by trustworthy – and hence sparse – evidence. This makes these approaches incapable of identifying misleading claims, which omit relevant information, or of verifying complex claims.

In this project, we investigate solutions to overcome these issues by (a) creating annotations to identify misleading claims based on valid but cherry-picked evidence, and (b) simplifying fact-checking of complex claims by decomposing claims into simpler sub-claims with respect to the available evidence. To deal with a ground truth that changes over time, we further (c) annotate each claim with respect to different pieces of evidence, which may yield different veracity labels for the same claim. This allows for an evaluation strictly based on the available evidence.
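The idea in (c) can be illustrated with a minimal sketch in which the gold label of a claim is looked up per evidence set rather than stored once per claim. All names here (`EvidenceLabel`, `AnnotatedClaim`, `evidence_conditioned_accuracy`) and the three-way label set are assumptions made for illustration; they are not the project's actual annotation schema or code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class EvidenceLabel:
    """One annotation: a veracity label valid for exactly this evidence set."""
    evidence_ids: frozenset[str]
    label: str  # assumed label set: "true", "false", "unproven"


@dataclass
class AnnotatedClaim:
    """A claim whose veracity depends on which evidence is available."""
    text: str
    labels: list[EvidenceLabel] = field(default_factory=list)

    def add_label(self, evidence_ids, label: str) -> None:
        self.labels.append(EvidenceLabel(frozenset(evidence_ids), label))

    def gold_label(self, evidence_ids) -> str | None:
        """Return the annotated label for this exact evidence set, if any."""
        key = frozenset(evidence_ids)
        for entry in self.labels:
            if entry.evidence_ids == key:
                return entry.label
        return None


def evidence_conditioned_accuracy(claim: AnnotatedClaim, predictions: dict) -> float:
    """Score predictions only against labels annotated for the same evidence.

    `predictions` maps a frozenset of evidence ids to a predicted label.
    """
    scored = [
        (entry.label, predictions[entry.evidence_ids])
        for entry in claim.labels
        if entry.evidence_ids in predictions
    ]
    if not scored:
        return 0.0
    return sum(gold == pred for gold, pred in scored) / len(scored)
```

With this layout, the same claim can be annotated as unproven for an early evidence set and as false once a later piece of evidence is added, and a system is only judged on the evidence it was actually given.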

First, in the Decomposition step, complex claims are decomposed into isolated statements (subclaims). These subclaims are often connected via relations (such as causal relations). In addition to generating isolated subclaims, this step includes the identification of these relations. Relations may be explicit (left) or implicit (right).

In the Prediction step, the system must first identify relevant evidence from a trustworthy knowledge base. Given this evidence, the model predicts for each subclaim and each relation whether it is true, false, or unproven. The resulting graph with the predicted veracity serves as a highly interpretable output for the user. This is crucial, especially in the context of conspiracy theories, which often rely on speculations that cannot be factually debunked.
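The output described above can be pictured as a small labeled graph: subclaims are nodes, relations are edges, and both carry a verdict. The following is a minimal sketch of such a data structure only. The class and method names are hypothetical, and the decomposition and prediction models themselves are not shown.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    TRUE = "true"
    FALSE = "false"
    UNPROVEN = "unproven"


@dataclass
class SubClaim:
    id: str
    text: str
    verdict: Verdict | None = None  # None until the Prediction step has run


@dataclass
class Relation:
    source: str      # id of the subclaim the relation starts from
    target: str      # id of the subclaim the relation points to
    kind: str        # e.g. "causal"
    explicit: bool   # True if stated in the claim, False if only implied
    verdict: Verdict | None = None


@dataclass
class ClaimGraph:
    subclaims: dict[str, SubClaim] = field(default_factory=dict)
    relations: list[Relation] = field(default_factory=list)

    def add_subclaim(self, sub_id: str, text: str) -> SubClaim:
        self.subclaims[sub_id] = SubClaim(sub_id, text)
        return self.subclaims[sub_id]

    def add_relation(self, source: str, target: str, kind: str, explicit: bool) -> Relation:
        if source not in self.subclaims or target not in self.subclaims:
            raise KeyError("relation endpoints must be existing subclaims")
        relation = Relation(source, target, kind, explicit)
        self.relations.append(relation)
        return relation

    def unproven_parts(self) -> list[str]:
        """Describe every subclaim and relation predicted as unproven."""
        parts = [s.id for s in self.subclaims.values() if s.verdict is Verdict.UNPROVEN]
        parts += [
            f"{r.source}->{r.target}"
            for r in self.relations
            if r.verdict is Verdict.UNPROVEN
        ]
        return parts
```

Keeping a verdict on the relation as well as on the subclaims lets the output show a case where both statements are individually true but the asserted link between them is unproven.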

Team

  • Prof. Dr. Iryna Gurevych (Principal Investigator)
  • Luke Bates
  • Max Glockner

Funding

This research work has been funded by the German Federal Ministry of Education and Research and the Hessian Ministry of Higher Education, Research, Science and the Arts within their joint support of the National Research Center for Applied Cybersecurity ATHENE.

Publications

Rücklé, Andreas ; Geigle, Gregor ; Glockner, Max ; Beck, Tilman ; Pfeiffer, Jonas ; Reimers, Nils ; Gurevych, Iryna (2021):
AdapterDrop: On the Efficiency of Adapters in Transformers.
In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 7930-7946,
ACL, The 2021 Conference on Empirical Methods in Natural Language Processing, virtual Conference and Punta Cana, Dominican Republic, 07.-11.11.2021, ISBN 978-1-955917-09-4,
[Conference or Workshop Item]

Arzt, Steven ; Poller, Andreas ; Vallejo, Gisela (2021):
Tracing Contacts With Mobile Phones to Curb the Pandemic: Topics and Stances in People’s Online Comments About the Official German Contact-Tracing App.
In: CHI EA '21: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems,
ACM, 2021 CHI Conference on Human Factors in Computing Systems, virtual Conference, 08.-13.05.2021, ISBN 978-1-4503-8095-9,
DOI: 10.1145/3411763.3451631,
[Conference or Workshop Item]

Reimers, Nils ; Gurevych, Iryna (2021):
The Curse of Dense Low-Dimensional Information Retrieval for Large Index Sizes.
In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 605-611,
Association for Computational Linguistics, 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021), virtual Conference, 01.-06.08.2021,
[Conference or Workshop Item]

Geigle, Gregor ; Reimers, Nils ; Rücklé, Andreas ; Gurevych, Iryna (2021):
TWEAC: Transformer with Extendable QA Agent Classifiers.
In: arXiv-Computer Science, In: Computation and Language, (Preprint), arXiv, [Article]

Thakur, Nandan ; Reimers, Nils ; Daxenberger, Johannes ; Gurevych, Iryna (2021):
Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks.
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 296-310,
ACL, 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics, virtual Conference, 06.-11.06.2021, ISBN 978-1-954085-46-6,
[Conference or Workshop Item]