Text Generation for Tone of Voice and eCommerce
Over the past few decades, the volume of information content has grown rapidly across all types of mass media. As the volume of published data grows, accessing and processing it quickly becomes vital. Our research project, “Text Generation for Tone of Voice and eCommerce” (TGTOVE), is concerned with the content generation aspect of this process. Specifically, we focus on developing Natural Language Generation (NLG) techniques for eCommerce applications. The two use cases being addressed are:
- Re-generating texts with the required stylistic features while preserving the original content (that is, changing a document's tone of voice).
- Generating textual descriptions of life science products from sets of key-value attribute pairs.
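To make the second use case concrete, the sketch below shows one very simple way to turn key-value attribute pairs into a sentence, using hand-written templates. It is only an illustrative baseline and not the method developed in the project; the function name `describe_product` and the attribute keys `name`, `form`, `purity` and `storage` are hypothetical and chosen purely for the example.

```python
from typing import Mapping


def describe_product(attributes: Mapping[str, str]) -> str:
    """Render a one-sentence description from key-value attribute pairs.

    Hypothetical template baseline; the attribute keys are illustrative
    assumptions, not a schema taken from the project.
    """
    name = attributes.get("name", "This product")

    # Each known attribute contributes one clause; unknown keys are ignored.
    parts = []
    if "form" in attributes:
        parts.append(f"is supplied as a {attributes['form']}")
    if "purity" in attributes:
        parts.append(f"has a purity of {attributes['purity']}")
    if "storage" in attributes:
        parts.append(f"should be stored at {attributes['storage']}")

    if not parts:
        return f"{name}."

    # Join the clauses: "A", "A and B", or "A, B and C".
    if len(parts) == 1:
        body = parts[0]
    else:
        body = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"{name} {body}."


if __name__ == "__main__":
    print(describe_product({
        "name": "Sodium chloride",
        "form": "crystalline powder",
        "purity": "99.5%",
    }))
```

A template approach like this is fully transparent and easy to debug, but it has to be rewritten for every new domain and language, which is the trade-off a data-driven system would aim to avoid.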
The research problem is that of generating natural language utterances from structured data representations. The project aims to develop an NLG framework that includes:
- A robust automatic tool for generating high-quality natural language statements from structured content representations.
- A sensible evaluation methodology.
- A suite of methods for controlling the output of an NLG system.
We are motivated by the fact that available NLG systems are either restricted to very narrow domains and support a limited number of languages, or are uncontrollable and unpredictable because their behaviour is hard to interpret. This defines the key requirements for the system design: language independence, domain independence, and flexibility. The system is envisioned as data-driven and operating in an end-to-end fashion, but its mechanics should be transparent, allowing easy debugging and extension. We propose an approach that combines machine learning with linguistic knowledge to generate high-quality natural language utterances from structured representations and to deliver the results in a way that facilitates task-oriented improvements on the user side.
Application-wise, our research is not restricted to the targeted use cases. Any field that involves consolidating textual information and presenting it to human users can benefit from the results of our project, including document summarization, question answering, machine translation, and interactive learning.
Software Campus program (BMBF)
- Merck KGaA, Darmstadt, Germany
- Prof. Dr. Iryna Gurevych, Mentor
- Yevgeniy Puzikov, Doctoral Researcher
Puzikov, Yevgeniy ; Gardent, Claire ; Dagan, Ido ; Gurevych, Iryna (2019):
Revisiting the Binary Linearization Technique for Surface Realization.
In: The 12th International Conference on Natural Language Generation (INLG 2019), Tokyo, Japan, 29.10.2019--01.11.2019, [Online-Edition: https://public.ukp.informatik.tu-darmstadt.de/UKP_Webpage/pu...]
Puzikov, Yevgeniy ; Gurevych, Iryna (2018):
E2E NLG Challenge: Neural Models vs. Templates.
In: Proceedings of the 11th International Conference on Natural Language Generation (INLG 2018), Tilburg, Netherlands, 05.11.2018--08.11.2018, [Online-Edition: http://aclweb.org/anthology/W18-6557]