Text as a Process

Linguistic properties of collaboratively created texts in Web 2.0


Web 2.0 allows novel ways of collaboratively creating textual content. The ease of publication gives rise to phenomena such as multiple authorship, the editing and reuse of text snippets, and the merging of the roles of author and reader, all as the norm rather than the exception. In this context, Wikipedia is a unique corpus for linguistic research. Its size and its very large number of (mostly anonymous) authors make it one of the most valuable resources for researchers of contemporary corpora. In addition, it provides a full edit history and discussion pages for most of its articles, which let researchers investigate otherwise unobservable processes, namely text production, reception and collaboration.


  • Gaining insights into collaboration, production and reception processes
  • Establishing text quality measures
  • Analyzing the change of collaboratively created and edited texts over time
  • Uncovering and visualizing hidden relations by linking Wikipedia article discussions with the corresponding article content
  • Identifying the roles of users in the collaborative writing processes
  • Identifying successful collaboration patterns
  • Developing methods for analyzing dynamic, contemporary corpora


  • Extraction of Wikipedia article revisions and discussion pages using JWPL
  • Modelling of the article revisions and the dialogue acts in discussion pages
  • Discourse segmentation and discourse analysis of the articles' discussion pages
  • Extraction of features for automatic classification with machine learning techniques
  • Cross-lingual and cross-domain comparison of analysis results
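
As a rough illustration of what revision-based feature extraction could look like, the sketch below compares two versions of an article text and derives a few simple change counts. It is not the project's actual pipeline: the project extracts revisions with JWPL (a Java library), whereas this example uses Python's standard difflib module on plain strings, and the function name edit_features as well as the chosen features are assumptions made purely for illustration.

```python
from difflib import SequenceMatcher


def edit_features(old_text: str, new_text: str) -> dict:
    """Token-level change counts between two revisions (illustrative only)."""
    # Whitespace tokenization is a simplification; a real pipeline would
    # use a proper tokenizer and strip wiki markup first.
    old_tokens = old_text.split()
    new_tokens = new_text.split()
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)

    inserted = 0
    deleted = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1   # tokens present only in the old revision
        if tag in ("replace", "insert"):
            inserted += j2 - j1  # tokens present only in the new revision

    return {
        "inserted_tokens": inserted,
        "deleted_tokens": deleted,
        "length_change": len(new_tokens) - len(old_tokens),
        "similarity": matcher.ratio(),
    }


if __name__ == "__main__":
    before = "Wikipedia is a free encyclopedia"
    after = "Wikipedia is a free online encyclopedia"
    print(edit_features(before, after))
```

Feature dictionaries of this kind, computed for each pair of consecutive revisions, could then serve as input to a standard classifier; which features are actually informative is an empirical question this sketch does not answer.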


  • Oliver Ferschke, M.A., Doctoral Researcher
  • Johannes Daxenberger, M.A., Doctoral Researcher
  • Prof. Dr. Iryna Gurevych, Principal Investigator
  • Dr. Torsten Zesch, Principal Investigator


The LOEWE Research Center “Digital Humanities” is funded by the Hessian excellence program “Landes-Offensive zur Entwicklung Wissenschaftlich-ökonomischer Exzellenz” (LOEWE).