Guiding theme D1: Multi-level models of information quality in online scenarios
Guiding theme D1 is concerned with assessing the quality of online sources during the summarization process. Heterogeneous sources from the web vary widely in reliability. For a topic a user is interested in, the documents retrieved for a summary might originate from collaborative websites such as wikis, from discussion forums, or from online news pages, all of which differ in credibility. It is therefore important not just to evaluate how well a summary reflects the important aspects of the underlying text, but also to assess whether the collected information is reliable.
A comprehensive evaluation of the collected information must address several layers. First, the reliability of the domains of the retrieved documents needs to be assessed. Second, it must be determined whether the retrieved articles themselves are reliable, in order to filter out fake news and hyper-partisan content. This is particularly important if the source of the article is unknown. Third, the claims made in an article need to be evaluated, since even sources that are generally considered reliable might contain controversial claims. Because these steps are interdependent (an unreliable source frequently produces unreliable articles with unsubstantiated claims), a multi-level model is best suited for the evaluation of the information. This guiding theme therefore focuses on developing models that allow the quality of the gathered information to be assessed before the actual summarization process is carried out.
Example Ph.D. project
This example project investigates criteria and novel evaluation methods for assessing the heterogeneous document sources used in automatic document summarization. The heterogeneity of documents on the web, including user-generated and collaboratively generated content, poses new challenges for evaluating the quality and trustworthiness of documents. Existing information quality frameworks provide a basis for developing a new model tailored to the scope of AIPHES in general and to multi-document summarization in particular. The intended use of the information or documents has to be considered when defining the criteria. In addition, the model has to be composed of different levels, each of which has to be inspected.
A Ph.D. project in this area will therefore focus on creating this multi-level model and a comprehensive framework for the evaluation of heterogeneous document sources. The quality assessment framework will be developed in close collaboration with guiding theme D2: Manual and automatic Quality Assessment of Summaries from Heterogeneous Sources. The student will cooperate with users, especially from online editorial teams. He or she will monitor, document, and analyze the editorial processes in different companies and draw on the experience of our cooperation partner Institut für Kommunikation und Medien (IKUM) at Hochschule Darmstadt.
Research results of the first Ph.D. cohort
Research conducted for guiding theme D1 focuses on constructing an NLP pipeline for automated fact-checking. Manual fact-checking is today an important instrument in the fight against false information; however, it cannot solve the problem entirely. The large number of fake news articles generated at a high rate cannot easily be detected and debunked by human fact-checkers. Automated fact-checking, on the other hand, would allow a large number of articles to be validated in real time as they appear on the web.
Nevertheless, despite significant progress in the field of natural language processing in the past couple of years, a fully automated fact-checking system is not yet feasible. The validation process is very challenging, and the system must possess a number of capabilities, such as abstract reasoning or world knowledge, that cannot be easily realized with today’s machine learning techniques. The primary objective of this project is therefore to develop a system that assists a fact-checker and speeds up the validation process rather than taking over the job entirely. Since there is a human in the loop, the system needs to be transparent, so that the reason for a false verdict can be identified and the system subsequently improved.
To address these challenges, we propose a comprehensive system for automated fact-checking that focuses on the validation of controversial claims. Many of the challenges that arise in fact-checking can in fact be reduced to claim validation. To reduce the complexity of the problem, we divide the task into several subproblems and tackle them individually. This also increases the transparency of the system: based on the intermediate outputs of the subsystems, the fact-checker can comprehend why a particular verdict was predicted for a claim. To address the data sparsity of knowledge bases, we are developing a system that extracts its knowledge from web documents. This allows the system to assess the veracity of claims on a wide range of topics.
To solve the claim validation problem, we propose the following pipeline approach (Hanselowski and Gurevych 2017). In the first step, web documents relevant to the resolution of a given claim are retrieved, together with information about their sources. In the second step, evidence that supports or refutes the claim is identified in the web documents. In the third step, the stance of the evidence with respect to the claim is determined. In the fourth step, the actual claim validation is performed; the outputs of the three preceding subsystems serve as its input.
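The four steps can be sketched as a chain of interchangeable components. The following Python sketch is only an illustration of the data flow under stated assumptions: the names `Document`, `Evidence`, `validate_claim`, and all `toy_*` functions are hypothetical and are not taken from the project's code, and the toy components are trivial stand-ins for the trained modules.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    text: str
    source: str  # hypothetical field: where the document was retrieved from


@dataclass
class Evidence:
    sentence: str
    source: str
    stance: str = "unknown"  # filled in by step 3


def validate_claim(
    claim: str,
    retrieve: Callable[[str], List[Document]],
    extract: Callable[[str, List[Document]], List[Evidence]],
    detect_stance: Callable[[str, Evidence], str],
    decide: Callable[[str, List[Document], List[Evidence]], str],
) -> dict:
    """Run the four pipeline steps and keep every intermediate output."""
    documents = retrieve(claim)                      # step 1: document retrieval
    evidence = extract(claim, documents)             # step 2: evidence extraction
    for item in evidence:                            # step 3: stance detection
        item.stance = detect_stance(claim, item)
    verdict = decide(claim, documents, evidence)     # step 4: claim validation
    return {"claim": claim, "documents": documents,
            "evidence": evidence, "verdict": verdict}


# Toy stand-ins (assumptions for illustration only, not the real modules).
def toy_retrieve(claim: str) -> List[Document]:
    return [Document("The Eiffel Tower is in Paris, not in Berlin.", "example.org")]


def toy_extract(claim: str, documents: List[Document]) -> List[Evidence]:
    return [Evidence(d.text, d.source) for d in documents]


def toy_stance(claim: str, evidence: Evidence) -> str:
    return "refute" if " not " in evidence.sentence else "support"


def toy_decide(claim: str, documents: List[Document], evidence: List[Evidence]) -> str:
    supports = sum(e.stance == "support" for e in evidence)
    refutes = sum(e.stance == "refute" for e in evidence)
    if supports == refutes:
        return "not enough evidence"
    return "true" if supports > refutes else "false"


result = validate_claim("The Eiffel Tower is in Berlin",
                        toy_retrieve, toy_extract, toy_stance, toy_decide)
```

Because the returned dictionary keeps the retrieved documents and the stance-labelled evidence alongside the verdict, a fact-checker can inspect each intermediate output, which is the kind of transparency the pipeline design aims for.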
For the training of the individual modules, we have developed a richly annotated corpus, which is based on the Snopes fact-checking website. It contains all the required annotations for the individual modules of the pipeline, such as the claims and their ratings, documents containing important information for the resolution of the claims and their sources, evidence extracted from the documents and the stance of the evidence with respect to the claims.
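One way to picture how a single corpus entry could feed several modules is a nested record. The following Python sketch is an assumption made purely for illustration: the class and field names (`ClaimRecord`, `DocumentAnnotation`, `EvidenceAnnotation`) and the helper `stance_training_pairs` are hypothetical and do not reflect the actual corpus format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class EvidenceAnnotation:
    text: str
    stance: str  # stance of the evidence with respect to the claim


@dataclass
class DocumentAnnotation:
    url: str
    source: str  # information about where the document comes from
    evidence: List[EvidenceAnnotation] = field(default_factory=list)


@dataclass
class ClaimRecord:
    claim: str
    rating: str  # the rating assigned to the claim
    documents: List[DocumentAnnotation] = field(default_factory=list)


def stance_training_pairs(record: ClaimRecord) -> List[Tuple[str, str, str]]:
    """Flatten one record into (claim, evidence text, stance) triples,
    i.e. the view of the data a stance detection module would train on."""
    return [
        (record.claim, ev.text, ev.stance)
        for doc in record.documents
        for ev in doc.evidence
    ]
```

The same record could equally be flattened into claim and document pairs for retrieval, or into claim and rating pairs for claim validation, which is why a single richly annotated corpus can serve all modules.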
For the document retrieval module, we use the DrQA system (Chen et al. 2017) and the entity linking tool proposed by Sorokin and Gurevych (2018). For evidence extraction, we have developed a model inspired by the MatchPyramid model (Pang et al. 2016). For stance detection, we use a multi-layer perceptron (Hanselowski et al. 2018), which ranked second in the Fake News Challenge stance detection task. For claim validation, we have developed a model based on decomposable attention (Parikh et al. 2016).
- PI: Prof. Dr. Iryna Gurevych
- PhD student(s): Andreas Hanselowski
- Chen, Danqi, Adam Fisch, Jason Weston, and Antoine Bordes. Reading Wikipedia to Answer Open-Domain Questions. arXiv preprint arXiv:1704.00051 (2017).
- Sorokin, Daniil, and Iryna Gurevych. Mixing Context Granularities for Improved Entity Linking on Question Answering Data across Entity Categories. arXiv preprint arXiv:1804.08460 (2018).
- Pang, Liang, Yanyan Lan, Jiafeng Guo, Jun Xu, Shengxian Wan, and Xueqi Cheng. Text Matching as Image Recognition. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), pp. 2793-2799 (2016).
- Parikh, Ankur P., Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. A Decomposable Attention Model for Natural Language Inference. arXiv preprint arXiv:1606.01933 (2016).
Hanselowski, Andreas ; Zhang, Hao ; Li, Zile ; Sorokin, Daniil ; Schiller, Benjamin ; Schulz, Claudia ; Gurevych, Iryna (2018):
UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification.
FEVER Workshop, EMNLP 2018, Brussels, November 2 – November 4, 2018, [Conference publication]
Hanselowski, Andreas ; P. V. S., Avinesh ; Schiller, Benjamin ; Caspelherr, Felix ; Chaudhuri, Debanjan ; Meyer, Christian M. ; Gurevych, Iryna (2018):
A Retrospective Analysis of the Fake News Challenge Stance-Detection Task.
In: Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), pp. 1859-1874,
The 27th International Conference on Computational Linguistics (COLING 2018), Santa Fe, NM, USA, August 20 – August 26, 2018, [Conference publication]
Hanselowski, Andreas ; Gurevych, Iryna (2017):
A Framework for Automated Fact-Checking for Real-Time Validation of Emerging Claims on the Web.
In: NIPS 2017 Workshop on Prioritising Online Content,