Searching through noise for pros and cons

TU Darmstadt develops argument search engine for Internet texts

14.09.2018

Structured decision-making support: The “ArgumenText” research project of the Ubiquitous Knowledge Processing research group has found a way to filter concrete pro and con arguments on any topic out of the noise of the internet.

Dr. Johannes Daxenberger, Dr. Christian Stab and Dr. Tristan Miller (left to right), together with an international research team, are developing new methods for automatic recognition of arguments in large text sources. Picture: Katrin Binner

A Google search for the term “nuclear energy” yields approximately 268 million hits: explanations, definitions, lobbying texts, newspaper articles, anecdotes, conspiracy theories. How can someone, for example an investor, who wants real pro and con arguments about nuclear power as a decision-making aid find what they are looking for? The “ArgumenText” project of the Ubiquitous Knowledge Processing (UKP) research group in the Department of Computer Science at TU Darmstadt aims to filter concrete arguments out of large, heterogeneous masses of text.

A demo of the search system was recently released and has already proven its worth at trade fairs. For example, anyone who researches the subject of “nuclear energy” will, after a few seconds, see just under a hundred arguments for and against nuclear power, drawn from a variety of internet sites. The better CO2 balance and the efficiency of atomic energy generation are listed here, along with the toxicity and hazardous nature of the substances used and the long periods during which they continue to emit radiation into their surroundings. The respective sources are linked.

For this purpose, texts available on the internet are examined by means of neural networks, classified as relevant or not relevant to the search topic, and then mined for arguments. “Not only are individual words searched, but grammatical structures, contexts and semantics are examined to decide whether a statement is an ‘argument’ or not and whether it is on the pro or con side,” explains Dr. Johannes Daxenberger, who works in the team of Professor Iryna Gurevych as one of the two project managers at ArgumenText.
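The two-stage flow described above (first a relevance check, then a decision on whether a statement is an argument and which side it is on) can be illustrated with a minimal sketch. It is not the ArgumenText implementation: the real system uses neural networks, whereas the functions `toy_relevance` and `toy_stance`, the cue lists and the `Argument` record below are hypothetical keyword-based stand-ins chosen only to make the structure of the pipeline visible.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Argument:
    sentence: str
    stance: str        # "pro" or "con"
    confidence: float  # value in [0, 1]
    source: str        # where the sentence was found


# Hypothetical cue phrases; a real system would learn these from data.
PRO_CUES = ("efficient", "low co2", "reliable")
CON_CUES = ("toxic", "hazardous", "radioactive waste")


def toy_relevance(sentence: str, topic: str) -> bool:
    """Stand-in for a relevance model: all topic words must occur."""
    text = sentence.lower()
    return all(word in text for word in topic.lower().split())


def toy_stance(sentence: str) -> Optional[Tuple[str, float]]:
    """Stand-in for an argument/stance model.

    Returns (stance, confidence), or None if the sentence is not
    treated as an argument (no cues found, or a tie between sides).
    """
    text = sentence.lower()
    pro = sum(cue in text for cue in PRO_CUES)
    con = sum(cue in text for cue in CON_CUES)
    if pro == con:
        return None
    stance = "pro" if pro > con else "con"
    return stance, max(pro, con) / (pro + con)


def find_arguments(documents: List[Tuple[str, List[str]]], topic: str) -> List[Argument]:
    """Stage 1: keep relevant sentences. Stage 2: keep arguments, with stance."""
    results = []
    for source, sentences in documents:
        for sentence in sentences:
            if not toy_relevance(sentence, topic):
                continue
            verdict = toy_stance(sentence)
            if verdict is None:
                continue
            stance, confidence = verdict
            results.append(Argument(sentence, stance, confidence, source))
    return results


if __name__ == "__main__":
    docs = [("site-a.example", [
        "Nuclear energy is efficient.",
        "Nuclear energy produces radioactive waste.",
        "Nuclear energy was discussed yesterday.",
    ])]
    for arg in find_arguments(docs, "nuclear energy"):
        print(arg.stance, "|", arg.sentence, "|", arg.source)
```

Keeping the source alongside each result mirrors the linked sources in the demo; swapping the two toy functions for trained classifiers would leave the surrounding loop unchanged.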

The algorithms behind ArgumenText are being developed within the research group itself, building on initial experiments that started in 2014 with a corpus of student essays. “The challenge was to make a system trained on a specific type of text transferable to any kind of text,” says second project manager Dr. Christian Stab. “Argumentation in scientific texts, for example, is completely different from argumentation in social media.” The team operationalized various models from argumentation theory and taught computer systems to use these models. To optimize the algorithms, the team used a powerful computer network; a smaller, more powerful computer network that can efficiently index internet-based texts is now used for ongoing operation.

Public demonstrator

The demonstrator is stable and has recently become publicly available. The project is thus entering its next phase, which will test which applications are particularly promising for the new technology. The main target groups are decision-makers in business who must assess whether adopting an innovation is worthwhile, and journalists who must get to the core of a subject quickly and reliably when researching it, says Daxenberger. “We think that the system could be used profitably in these areas.”

For validation purposes, the participating scientists are currently preparing the method for use with German-language texts as well. At present, ArgumenText handles only English, works with a text corpus from 2016 and performs best on technical queries. This will soon change. It will also become possible to search in real time across the ever-growing number of texts on the internet.

Currently, the algorithm sorts statements by how reliably they can serve as arguments. The scientists are working on aggregating the arguments and presenting them to users grouped by theme. “This is obvious from an application perspective, but certainly not trivial from a technical point of view,” says Stab. Argument mining, the recognition of linguistic arguments by means of computer science, is becoming ever more important and visible in Digital Humanities research, say Daxenberger and Stab. The TU was active in this area early on. “Our working group has firmly and visibly established the TU in the field of argument mining,” says Professor Iryna Gurevych, head of the UKP. For this purpose, the interdisciplinary team works with the TU Department of Social and Historical Sciences, as well as with other universities from the network of Rhine-Main universities.
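The sorting by reliability and the planned grouping by theme mentioned at the start of this section can be pictured with a small sketch. It is only an illustration under stated assumptions: the article does not say how themes are determined, so the keyword table `THEME_KEYWORDS` and the functions `theme_of`, `rank_by_confidence` and `group_by_theme` are hypothetical, and the confidence values in the example are made up for demonstration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# An argument is represented here as (sentence, confidence).
ScoredArgument = Tuple[str, float]

# Hypothetical theme vocabulary; a real system would likely cluster
# arguments automatically instead of using a fixed keyword table.
THEME_KEYWORDS = {
    "climate": ("co2", "emission"),
    "safety": ("toxic", "radioactive", "hazard"),
}


def rank_by_confidence(arguments: List[ScoredArgument]) -> List[ScoredArgument]:
    """Sort arguments so the most reliable ones come first."""
    return sorted(arguments, key=lambda arg: arg[1], reverse=True)


def theme_of(sentence: str) -> str:
    """Assign the first theme whose keywords occur, else 'other'."""
    text = sentence.lower()
    for theme, keywords in THEME_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return theme
    return "other"


def group_by_theme(arguments: List[ScoredArgument]) -> Dict[str, List[ScoredArgument]]:
    """Group ranked arguments by theme; order within a theme stays ranked."""
    groups = defaultdict(list)
    for sentence, confidence in rank_by_confidence(arguments):
        groups[theme_of(sentence)].append((sentence, confidence))
    return dict(groups)


if __name__ == "__main__":
    example = [
        ("Reactors emit little CO2.", 0.7),
        ("Spent fuel stays radioactive.", 0.9),
        ("Uranium is toxic.", 0.6),
        ("Plants are expensive.", 0.5),
    ]
    for theme, items in group_by_theme(example).items():
        print(theme, items)
```

Ranking before grouping means each theme lists its most reliable statements first, which is one simple way to combine the current ordering with a thematic presentation.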

Publication

Stab, Christian; Daxenberger, Johannes; Stahlhut, Chris; Miller, Tristan; Schiller, Benjamin; Tauchmann, Christopher; Eger, Steffen; Gurevych, Iryna (2018): ArgumenText: Searching for Arguments in Heterogeneous Sources.

In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: System Demonstrations, New Orleans, Louisiana.

ArgumenText

The ArgumenText project is funded with 1.5 million euros by the Federal Ministry of Education and Research (BMBF) within the framework of the VIP+ program under grant number 03VP02540.

The project is supported by a special offer of the Capitalization of the TU Darmstadt. Several doctoral and research projects are linked to the project. ArgumenText runs from 2017 to 2020. If you’d like to try it, the public demonstrator can be found at www.argumentsearch.com.

Digital Humanities at TU

Under the name “Digital Humanities”, interdisciplinary collaborations use computer-aided methods to open up research-relevant resources in the humanities and cultural sciences and make them available digitally. TU Darmstadt places a significant focus on this area. For example, the Ubiquitous Knowledge Processing research group is part of CEDIFOR (Center for Digital Research in the Humanities, Social and Educational Sciences).

The Center helps bridge the gap between humanities research questions and computer-based methods. CEDIFOR builds on the experience, expertise and infrastructure of the LOEWE Digital Humanities focus, in which the TU Darmstadt was also centrally involved.

Silke Paradowski / jb