Uncovering Plagiarism, Authorship and Social Software Misuse at CLEF 2012

UKP participated in the inaugural edition of the Wikipedia Quality Flaw Prediction Task, held in the PAN Lab at CLEF 2012.

With over 23 million articles in 285 languages, Wikipedia is the largest free knowledge base on the web. Due to its open nature, everybody is allowed to access and edit the contents of this huge encyclopedia. As a downside of this open access policy, quality assessment of the content becomes a critical issue and is hardly manageable without computational assistance.

For PAN 2012, we developed FlawFinder, a modular system for automatically predicting quality flaws in unseen Wikipedia articles. According to the official results, our system placed first in terms of precision and second in terms of recall and F1-measure.

A detailed description of our system is available in the following paper:

  • Oliver Ferschke, Iryna Gurevych, and Marc Rittberger. FlawFinder: A Modular System for Predicting Quality Flaws in Wikipedia – Notebook for PAN at CLEF 2012. In CLEF 2012 Labs and Workshop, Notebook Papers (to appear), Rome, Italy, September 2012.