QA-EduInf

Community-based Question Answering for Educational Information

Motivation

In the age of lifelong learning, the amount of educational data provided by expert services and community-based question-and-answer (QA) pages on the Web is growing fast. Although these pages provide useful information, benefiting from them is not always easy. Users have to go through various educational information services and query each of them individually, which takes considerable effort. They also have to work out which of the available web pages and services are reliable and provide high-quality information. Automatically analyzing this information will help users access what they need with minimal effort. To this end, we are creating an automatic question answering system that searches the available educational information sources to answer users' questions.

Goals

The basic goal of this project is to answer user questions on various educational topics, for instance “How can I do voluntary service abroad?” or “Where can I get information on studying Mathematics?”. Since a large portion of users' questions have already been asked by other people and answered by experts or the crowd, we use the available question and answer archives to answer them. The resulting system will have an interface that takes natural language questions, retrieves the requested information from heterogeneous information sources on the Web, and presents it to the user as a filtered, summarized, and quality-assessed answer. If the system cannot retrieve an answer, it could automatically post the question to the community-based QA sites used in the project so that the crowd can provide an answer later on.

Another goal of the project is to investigate the impact of semantic information on community-based QA. A Semantic Role Labeling processor converts source texts into a shallow semantic representation, which provides a useful level of abstraction for other QA tasks. The system is developed for both German and English, so that the principles and design decisions behind it can be transferred to other languages as well.

Methods

1 – Interface for natural language questions

The interface allows users to ask questions and specify the educational information sources that should be used to retrieve answers. In addition, they can define criteria for assessing answer relevance.

2 – Text processing and retrieval

This part of the project collects and analyzes all question-answer pairs that are available in FAQ collections and social QA forums. This data is then used for the online search. After a user issues a new query (i.e., a question), the system uses paraphrase recognition and information retrieval techniques to find similar questions in the collected data that have already been answered. The answers to the retrieved questions are treated as candidate answers to the new question and are further assessed with answer re-ranking methods.
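
The retrieval step can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a hypothetical three-entry archive and stands in plain bag-of-words cosine similarity for the paraphrase recognition and retrieval models, and all names (ARCHIVE, tokenize, cosine, retrieve) are made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical toy archive of (question, answer) pairs; not project data.
ARCHIVE = [
    ("How can I do voluntary service abroad?",
     "Contact an organisation that places volunteers in other countries."),
    ("Where can I get information on studying Mathematics?",
     "University websites describe their Mathematics degree programmes."),
    ("How do I apply for a student loan?",
     "Applications are submitted to the responsible funding office."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, archive, top_k=2):
    """Return up to top_k (score, question, answer) tuples, best first."""
    q_vec = Counter(tokenize(query))
    scored = [
        (cosine(q_vec, Counter(tokenize(question))), question, answer)
        for question, answer in archive
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# The answers of the best-matching archived questions become candidates.
candidates = retrieve(
    "Where do I find information about studying Mathematics?", ARCHIVE
)
for score, question, answer in candidates:
    print(f"{score:.2f}  {question}  ->  {answer}")
```

Word overlap misses paraphrases that share few words, which is why a real system would replace the cosine function with a learned similarity model while keeping the same retrieve-then-re-rank structure.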

3 – Quality assessment

Answer quality can be assessed along several dimensions: for example, the textual quality of an answer, whether it contains offensive language, and whether the text is subjective or objective. This information is then used to re-rank or filter a set of answers according to a chosen set of quality criteria.
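
As a rough sketch of filtering and re-ranking by quality criteria, the example below drops answers flagged as offensive and orders the rest by a weighted mix of relevance and a textual-quality score. Everything in it is an assumption made for illustration: the word list, the length-based quality proxy, the weighting, and the names Answer, is_offensive, textual_quality, and rerank. A real system would use trained classifiers for each criterion.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    relevance: float  # hypothetical retrieval score in [0, 1]


# Hypothetical word list; a real system would use a trained classifier.
OFFENSIVE_WORDS = {"idiot", "stupid"}


def is_offensive(answer):
    """Flag an answer if any of its words is on the word list."""
    words = {w.strip(".,!?").lower() for w in answer.text.split()}
    return bool(words & OFFENSIVE_WORDS)


def textual_quality(answer):
    """Crude proxy: longer answers (capped at 20 words) score higher."""
    return min(len(answer.text.split()), 20) / 20


def rerank(answers, quality_weight=0.5, drop_offensive=True):
    """Filter out offensive answers, then sort by the combined score."""
    kept = [a for a in answers if not (drop_offensive and is_offensive(a))]

    def score(a):
        return ((1 - quality_weight) * a.relevance
                + quality_weight * textual_quality(a))

    return sorted(kept, key=score, reverse=True)


answers = [
    Answer("Ask the office.", 0.9),
    Answer("Only an idiot would ask that.", 0.95),
    Answer("The student advisory service publishes a list of accredited "
           "programmes and answers questions by e-mail.", 0.8),
]
ranked = rerank(answers)
for a in ranked:
    print(a.text)
```

Exposing quality_weight and drop_offensive as parameters mirrors the idea that the set of quality criteria is chosen per request rather than fixed.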

4 – Answer summarization

If the system retrieves several relevant answers for a user question, some may contain complementary information and others may be redundant. The system therefore uses multi-document summarization techniques to condense the most important information in the retrieved answers with regard to the user question.
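
One simple way to balance relevance against redundancy is a greedy selection in the style of maximal marginal relevance. The sketch below is an assumed stand-in rather than the summarization method used in the project: it scores sentences with Jaccard word overlap, and the function names, example sentences, and weights are hypothetical.

```python
import re


def tokens(text):
    """Set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_sentences(question, sentences, max_sentences=2,
                     redundancy_weight=0.5):
    """Greedily pick sentences that are relevant to the question
    but dissimilar to the sentences already selected."""
    q = tokens(question)
    selected = []
    remaining = list(sentences)
    while remaining and len(selected) < max_sentences:

        def marginal_score(s):
            relevance = jaccard(tokens(s), q)
            redundancy = max(
                (jaccard(tokens(s), tokens(t)) for t in selected),
                default=0.0,
            )
            return ((1 - redundancy_weight) * relevance
                    - redundancy_weight * redundancy)

        best = max(remaining, key=marginal_score)
        selected.append(best)
        remaining.remove(best)
    return selected


question = "How can I do voluntary service abroad?"
sentences = [
    "You can do voluntary service abroad through a placement organisation.",
    "Voluntary service abroad is possible through a placement organisation.",
    "Most programmes require an application several months in advance.",
]
summary = select_sentences(question, sentences)
print(" ".join(summary))
```

In this example the second sentence is skipped because it largely repeats the first, and the third sentence is kept because it adds complementary information.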

5 – Semantic Role Labeling

One of the project's objectives is to evaluate the impact of semantic information on community-based question answering. Semantic Role Labeling (SRL) is the task of automatically inferring shallow semantic interpretations that describe input texts in terms of events and their participants (e.g., who did what to whom). This information is then used to improve the quality of question categorization, answer retrieval, and summarization. Several theoretical frameworks for Semantic Role Labeling exist, and they differ in the granularity of their descriptions. The choice of granularity is a trade-off between semantic richness and processing quality, and it is important to investigate which representation suits the question answering task best. Adapting existing SRL systems to the non-standard language often used in Q&A poses an additional challenge.
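
To make the notion of granularity concrete, the sketch below stores the same sentence in two role inventories: coarse numbered arguments in the style of PropBank and frame-specific roles in the style of FrameNet. The annotations are written by hand for illustration and no labeler is run; the Frame class and the answer_role helper are hypothetical names, not part of any SRL toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    predicate: str
    roles: dict = field(default_factory=dict)  # role label -> text span


# Hand-written annotations for illustration; no labeler is run here.
sentence = "The university gives international students a scholarship."

# Coarse, predicate-specific numbered arguments (PropBank style).
propbank_style = Frame("give", {
    "ARG0": "The university",
    "ARG1": "a scholarship",
    "ARG2": "international students",
})

# Finer-grained, frame-specific role names (FrameNet style).
framenet_style = Frame("Giving", {
    "Donor": "The university",
    "Theme": "a scholarship",
    "Recipient": "international students",
})


def answer_role(frame, asked_role):
    """Return the span filling the role a question asks about, or None."""
    return frame.roles.get(asked_role)


# "Who receives a scholarship?" asks for the recipient role.
print(answer_role(propbank_style, "ARG2"))
print(answer_role(framenet_style, "Recipient"))
```

Both representations answer the question with the same span; they differ in how much meaning the role label itself carries, which is the trade-off between semantic richness and processing quality described above.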

Team

  • Prof. Dr. Iryna Gurevych, Principal Investigator
  • Nafise Moosavi, Postdoctoral Researcher
  • Andreas Rücklé, Doctoral Researcher
  • Ilia Kuznetsov, Doctoral Researcher
  • Daniil Sorokin, Doctoral Researcher
  • Yang Gao, Postdoctoral Researcher

Former Staff:

  • Teresa Botschen, Doctoral Researcher (associated)
  • Dr. Saeedeh Momtazi, Postdoctoral Researcher (associated, on leave)
  • Silvana Hartmann, Doctoral Researcher
  • Yan Shao, Project Staff

Funding

This project is funded by the Deutsche Forschungsgemeinschaft (German Research Foundation). Our report on the first funding phase of the research grant is available online: QA_EduInf_Report.pdf

Publications

Böhm, Florian ; Gao, Yang ; Meyer, Christian M. ; Shapira, Ori ; Dagan, Ido ; Gurevych, Iryna (2019):
Better Rewards Yield Better Summaries: Learning to Summarise Without References.
In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), In: The 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, China, 03.11.2019--07.11.2019, [Online-Edition: https://public.ukp.informatik.tu-darmstadt.de/UKP_Webpage/pu...],
[Conference publication]

Rücklé, Andreas ; Moosavi, Nafise Sadat ; Gurevych, Iryna (2019):
Neural Duplicate Question Detection without Labeled Training Data.
In: The 2019 Conference on Empirical Methods in Natural Language Processing, Hong Kong, China, 03.11.2019-07.11.2019, [Online-Edition: https://public.ukp.informatik.tu-darmstadt.de/UKP_Webpage/pu...],
[Conference publication]

Simpson, Edwin ; Gurevych, Iryna (2019):
A Bayesian Approach for Sequence Tagging with Crowds.
In: The 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, China, 03.11.2019-07.11.2019, [Online-Edition: https://public.ukp.informatik.tu-darmstadt.de/UKP_Webpage/pu...],
[Conference publication]

Gao, Yang ; Eger, Steffen ; Kuznetsov, Ilia ; Gurevych, Iryna ; Miyao, Yusuke (2019):
Does My Rebuttal Matter? Insights from a Major NLP Conference.
Minneapolis, USA, In: The 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, USA, 02.06.2019-07.06.2019, [Online-Edition: https://www.aclweb.org/anthology/N19-1129],
[Conference publication]

Gao, Yang ; Meyer, Christian M. ; Mesgar, Mohsen ; Gurevych, Iryna (2019):
Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation.
In: Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), In: The 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), Macao, China, 10.08.2019--16.08.2019, [Online-Edition: https://www.ijcai.org/proceedings/2019/0326.pdf],
[Conference publication]

Moosavi, Nafise Sadat ; Born, Leo ; Poesio, Massimo ; Strube, Michael (2019):
Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection.
In: The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, 28.07.2019-02.08.2019, [Online-Edition: https://www.aclweb.org/anthology/P19-1408],
[Conference publication]

Simpson, Edwin ; Do Dinh, Erik-Lân ; Miller, Tristan ; Gurevych, Iryna (2019):
Predicting Humorousness and Metaphor Novelty with Gaussian Process Preference Learning.
In: The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), Florence, Italy, 28.07.2019-02.08.2019, [Online-Edition: https://www.aclweb.org/anthology/P19-1572],
[Conference publication]

Rücklé, Andreas ; Swarnkar, Krishnkant ; Gurevych, Iryna (2019):
Improved Cross-Lingual Question Retrieval for Community Question Answering.
In: Proceedings of The Web Conference (WWW-19), San Francisco, USA, [Online-Edition: https://dl.acm.org/citation.cfm?id=3313502],
[Conference publication]

Eger, Steffen ; Şahin, Gözde Gül ; Rücklé, Andreas ; Lee, Ji-Ung ; Schulz, Claudia ; Mesgar, Mohsen ; Swarnkar, Krishnkant ; Simpson, Edwin ; Gurevych, Iryna (2019):
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems.
In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, USA, In: The 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, USA, 02.06.2019--07.06.2019, [Online-Edition: https://www.aclweb.org/anthology/N19-1165],
[Conference publication]

Rücklé, Andreas ; Moosavi, Nafise Sadat ; Gurevych, Iryna (2019):
COALA: A Neural Coverage-Based Approach for Long Answer Selection with Small Data.
In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, USA, In: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, USA, DOI: 10.1609/aaai.v33i01.33016932,
[Online-Edition: https://aaai.org/ojs/index.php/AAAI/article/view/4671/4549],
[Conference publication]

Kuznetsov, Ilia ; Gurevych, Iryna (2018):
Corpus-driven Thematic Hierarchy Induction.
In: Proceedings of the 22nd Conference on Computational Natural Language Learning, Association for Computational Linguistics, In: Proceedings of the 22nd Conference on Computational Natural Language Learning, Brussels, Belgium, October 31 - November 1, 2018, [Online-Edition: http://aclweb.org/anthology/K18-1006],
[Conference publication]

Eger, Steffen ; Rücklé, Andreas ; Gurevych, Iryna (2018):
PD3: Better Low-Resource Cross-Lingual Transfer By Combining Direct Transfer and Annotation Projection.
In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, In: 5th Workshop on Argument Mining, Brussels, Belgium, 31.10.2018, [Online-Edition: https://fileserver.ukp.informatik.tu-darmstadt.de/UKP_Webpag...],
[Conference publication]

Sorokin, Daniil ; Gurevych, Iryna (2018):
Interactive Instance-based Evaluation of Knowledge Base Question Answering.
In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), In: The 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31.10.2018--04.11.2018, System Demonstrations, [Online-Edition: https://aclweb.org/anthology/D18-2020.pdf],
[Conference publication]

Kuznetsov, Ilia ; Gurevych, Iryna (2018):
From Text to Lexicon: Bridging the Gap between Word Embeddings and Lexical Resources.
In: Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), Association for Computational Linguistics, In: The 27th International Conference on Computational Linguistics (COLING 2018), Santa Fe, USA, 20.08.2018--26.08.2018, [Online-Edition: http://aclweb.org/anthology/C18-1020],
[Conference publication]

Sorokin, Daniil ; Gurevych, Iryna (2018):
Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering.
In: Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018), In: The 27th International Conference on Computational Linguistics (COLING 2018), Santa Fe, USA, 20.08.2018--26.08.2018, [Online-Edition: http://aclweb.org/anthology/C18-1280],
[Conference publication]

Momtazi, Saeedeh ; Gurevych, Iryna (2018):
Unsupervised Latent Dirichlet Allocation for supervised question classification.
In: Information Processing & Management, pp. 380-393, 54, (3), ISSN 0306-4573,
DOI: 10.1016/j.ipm.2018.11.007,
[Online-Edition: https://www.sciencedirect.com/science/article/pii/S030645731...],
[Article]

Sorokin, Daniil ; Gurevych, Iryna (2018):
Mixing Context Granularities for Improved Entity Linking on Question Answering Data across Entity Categories.
In: Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics (*SEM), Stroudsburg PA, USA, In: The Seventh Joint Conference on Lexical and Computational Semantics (*SEM), New Orleans, USA, 05.06.2018--06.06.2018, DOI: 10.18653/v1/S18-2007,
[Online-Edition: http://aclweb.org/anthology/S18-2007],
[Conference publication]

Rücklé, Andreas ; Eger, Steffen ; Peyrard, Maxime ; Gurevych, Iryna (2018):
Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations.
In: arXiv:1803.01400, [Online-Edition: https://arxiv.org/abs/1803.01400],
[Article]

Sorokin, Daniil ; Gurevych, Iryna (2017):
End-to-end Representation Learning for Question Answering with Weak Supervision.
In: Semantic Web Challenges: 4th SemWebEval Challenge at ESWC 2017, Springer, Cham, Portoroz, Slovenia, In: Communications in Computer and Information Science, 769, [Online-Edition: https://link.springer.com/chapter/10.1007%2F978-3-319-69146-...],
[Conference publication]

Rücklé, Andreas ; Gurevych, Iryna (2017):
Real-Time News Summarization with Adaptation to Media Attention.
In: Proceedings of the 11th Conference on Recent Advances in Natural Language Processing (RANLP 2017), Association for Computational Linguistics, Varna, Bulgaria, DOI: 10.26615/978-954-452-049-6_079,
[Online-Edition: https://doi.org/10.26615/978-954-452-049-6_079],
[Conference publication]

Rücklé, Andreas ; Gurevych, Iryna (2017):
Representation Learning for Answer Selection with LSTM-Based Importance Weighting.
In: Proceedings of the 12th International Conference on Computational Semantics (IWCS 2017), Association for Computational Linguistics, Montpellier, France, Volume 2: Short papers, [Online-Edition: http://aclweb.org/anthology/W17-6935],
[Conference publication]

Sorokin, Daniil ; Gurevych, Iryna (2017):
Context-Aware Representations for Knowledge Base Relation Extraction.
In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Copenhagen, Denmark, [Online-Edition: http://aclweb.org/anthology/D17-1188],
[Conference publication]

Rücklé, Andreas ; Gurevych, Iryna (2017):
End-to-End Non-Factoid Question Answering with an Interactive Visualization of Neural Attention Weights.
In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics-System Demonstrations (ACL 2017), Association for Computational Linguistics, Vancouver, Canada, 4: System Demonstrations, [Online-Edition: http://aclweb.org/anthology/P17-4004],
[Conference publication]

Bugert, Michael ; Puzikov, Yevgeniy ; Rücklé, Andreas ; Eckle-Kohler, Judith ; Martin, Teresa ; Martínez Cámara, Eugenio ; Sorokin, Daniil ; Peyrard, Maxime ; Gurevych, Iryna (2017):
LSDSem 2017: Exploring Data Generation Methods for the Story Cloze Test.
In: Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics, Association for Computational Linguistics, In: The 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics, Valencia, Spain, 03.04.2017--04.04.2017, ISBN 978-1-945626-40-1,
[Online-Edition: http://aclweb.org/anthology/W17-0908],
[Conference publication]

Hartmann, Silvana ; Kuznetsov, Ilia ; Martin, Teresa ; Gurevych, Iryna (2017):
Out-of-domain FrameNet Semantic Role Labeling.
In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Association for Computational Linguistics, Valencia, Spain, [Online-Edition: http://aclweb.org/anthology/E17-1045],
[Conference publication]

Hartmann, Silvana ; Mújdricza-Maydt, Éva ; Kuznetsov, Ilia ; Gurevych, Iryna ; Frank, Anette (2017):
Assessing SRL Frameworks with Automatic Training Data Expansion.
In: Proceedings of the 11th Linguistics Annotation Workshop (LAW XI) at EACL 2017, Association for Computational Linguistics, Valencia, Spain, [Online-Edition: https://aclweb.org/anthology/W/W17/W17-0814.pdf],
[Conference publication]

Mújdricza-Maydt, Éva ; Hartmann, Silvana ; Gurevych, Iryna ; Frank, Anette (2016):
Combining Semantic Annotation of Word Sense & Semantic Roles: A Novel Annotation Scheme for VerbNet Roles on German Language Data.
In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, [Online-Edition: http://www.lrec-conf.org/proceedings/lrec2016/pdf/1129_Paper...],
[Conference publication]
