Guiding Theme C1: Motif Analysis of Text-Based Graphs
Guiding theme C1 explores motif-based approaches on text-based networks. Motifs are small induced subgraphs in large networks. The motif signature of a network with respect to a selection of motifs reflects semantic characteristics of the modeled phenomena. Therefore, motif signatures are a general-purpose technique to characterize networks, to distinguish between networks with different characteristics, and to achieve a deeper understanding of the modeled phenomena.
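To make the notion of a motif signature concrete, the following is a minimal sketch, not the method used in this project. It assumes an undirected graph, restricts the motif selection to the two connected three-node motifs (open triad and triangle), and normalizes their counts to relative frequencies. The function names `cooccurrence_edges` and `motif_signature` are hypothetical, and building the graph from adjacent words is an illustrative assumption about how a text-based network might be constructed.

```python
from itertools import combinations


def cooccurrence_edges(sentences):
    """Hypothetical graph construction: link each pair of adjacent words."""
    edges = set()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for left, right in zip(tokens, tokens[1:]):
            if left != right:  # skip self-loops from repeated words
                edges.add((left, right))
    return edges


def motif_signature(edges):
    """Relative frequencies of connected induced 3-node subgraphs.

    Assumption: the graph is undirected and the motif selection consists
    of the open triad (2 edges) and the triangle (3 edges) only.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    counts = {"open_triad": 0, "triangle": 0}
    # Brute-force enumeration of node triples; fine for small graphs only.
    for a, b, c in combinations(list(adjacency), 3):
        n_edges = (b in adjacency[a]) + (c in adjacency[a]) + (c in adjacency[b])
        if n_edges == 2:
            counts["open_triad"] += 1
        elif n_edges == 3:
            counts["triangle"] += 1

    total = sum(counts.values())
    return {name: (count / total if total else 0.0)
            for name, count in counts.items()}


if __name__ == "__main__":
    graph = cooccurrence_edges(["the cat sat", "the cat ran"])
    print(motif_signature(graph))
```

Two networks can then be compared by comparing their signature vectors. The brute-force enumeration is cubic in the number of nodes, so a large network would need a dedicated motif-counting algorithm.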
Research results of the first Ph.D. cohort
Previous work indicated that these methods can be quite powerful in natural language processing tasks (Mesgar and Strube, 2015; Biemann et al., 2012). Building on this, we used graph motifs to analyze text quality, and we explored the advantages of motifs in other textual domains, such as collaborative writing communities and political speeches. Our most important methodological contributions were meta motifs (motifs of motifs) and temporal motifs (motifs that change over time). These results show that graph motifs can be used to assess the quality of texts and that they offer insight into what constitutes text quality, both of which are important for summarization. Our graph-based methods can be adapted to a variety of settings and data types. As a showcase, we incorporated the semantic frame embeddings developed by guiding theme C3 and, in two experimental settings, explored graphs and graph motifs based on semantic frames. We also investigated graph motifs in networks that are built not from the text itself but from the interactions of users who collaborate on texts. Another study analyzed patterns in the user behavior of online writing communities that have beneficial or detrimental effects on the quality of a community and the texts it produces (Arnold et al., 2017).
Ongoing project of the 2nd Ph.D. cohort
The scope and goals of the second phase will differ substantially from those of the first phase, shifting from text understanding/reception to text generation. However, measuring text quality, now in the generated texts, will remain a core focus.
In particular, we will apply deep learning methods, which have achieved highly promising results on text generation tasks such as summarization and machine translation. However, the generated texts still have several shortcomings, such as truncated or repeated words, meaningless or discourse-inconsistent content, and ungrammatical sentences. We will provide principled methods to address these issues and aim to apply them to summarization, machine translation, dialogue response generation, and other text generation tasks. A particular focus will be on learning from several tasks jointly, possibly across different languages, to achieve these goals.
In addition, because of the complexity of natural language, manual evaluation is the most reliable form of evaluation today, but it is too expensive and time-consuming. Therefore, researchers mostly rely on automatic evaluation metrics. ROUGE is the accepted standard for automatic evaluation because of its simplicity and because its scores, which are based on hard lexical matching, correlate highly with human judgments. However, it fails to account for paraphrasing, linguistic quality (lexical and grammatical completeness and correctness, the difficulty level of the text), and coverage with respect to the original text. Furthermore, having too few gold references (i.e., desired outputs for a given input) biases the quality estimates for text generation systems. The second phase will provide new, suitable evaluation metrics that particularly address paraphrasing and linguistic quality assessment when only a few gold references are available, or even none at all.
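The paraphrasing problem of hard lexical matching can be illustrated with a small sketch of ROUGE-N recall. This is a simplified reimplementation under stated assumptions, not the official ROUGE toolkit: it uses lowercased whitespace tokenization, a single reference, and no stemming or stopword handling. The function names `ngram_counts` and `rouge_n_recall` are hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Assumptions: whitespace tokenization, lowercasing, one reference,
    clipped counts (a reference n-gram is matched at most as often as
    it appears in the candidate).
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    verbatim = "the cat sat on the mat"
    paraphrase = "a feline rested upon a rug"
    print(rouge_n_recall(verbatim, reference))
    print(rouge_n_recall(paraphrase, reference))
```

The paraphrase shares no surface tokens with the reference, so it scores zero even though a human reader would judge the two sentences to be close in meaning. With only one reference, any valid rewording is penalized in the same way.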
This guiding theme will benefit from results and models produced in other guiding themes, particularly B1 (summaries) and B3 (paraphrasing). The setup considered here is more general, however, as it targets text generation tasks in general. D2 is also highly relevant because it likewise addresses the quality assessment of generated text.
- PI (Second Cohort): Dr. Steffen Eger
- PI (First Cohort): Prof. Dr. Karsten Weihe
- First Cohort PhD student: Thomas Arnold
- Second Cohort PhD student: Wei Zhao
- Chris Biemann, Stefanie Roos, and Karsten Weihe. Quantifying semantics using complex network analysis. In Proceedings of the 24th International Conference on Computational Linguistics (COLING), Mumbai, India, 2012.
- Mohsen Mesgar and Michael Strube. Graph-based coherence modeling for assessing readability. In Proceedings of the 4th Joint Conference on Lexical and Computational Semantics (*SEM), pages 309–318, 2015.
Tauchmann, Christopher ; Arnold, Thomas ; Hanselowski, Andreas ; Meyer, Christian M. ; Mieskes, Margot (2018):
Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data.
In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC), European Language Resources Association, Miyazaki, Japan, [Online-Edition: http://www.lrec-conf.org/proceedings/lrec2018/summaries/252....]
Arnold, Thomas ; Daxenberger, Johannes ; Weihe, Karsten ; Gurevych, Iryna (2017):
Is Interaction More Important Than Individual Performance? A Study of Motifs in Wikia.
In: Proceedings of the 26th International Conference Companion on World Wide Web (WWW '17 Companion), International World Wide Web Conferences Steering Committee, Perth, Australia, [Online-Edition: http://dl.acm.org/citation.cfm?id=3041021.3053362]
Arnold, Thomas ; Weihe, Karsten (2016):
Network Motifs May Improve Quality Assessment of Text Documents.
In: Proceedings of TextGraphs-10: the Workshop on Graph-based Methods for Natural Language Processing, [Online-Edition: http://www.aclweb.org/anthology/W16-1404]