Information Consolidation

Open Mining INfrastructure for TExt and Data (OpenMinTeD)


Recent years have witnessed an upsurge in the quantity of digital research data, offering new insights and opportunities for improved understanding. Text and data mining is emerging as a powerful tool for harnessing structured and unstructured content and data, analysing them at multiple levels and in several dimensions to discover hidden and new knowledge. Text mining solutions, however, are not easy to discover and use, nor can end users easily combine them.

OpenMinTeD aspires to enable the creation of an infrastructure that fosters and facilitates the discovery and use of text mining technologies and interoperable services. It examines several use cases identified by experts from different scientific areas, ranging from generic scholarly communication to literature related to life sciences, food and agriculture, and social sciences and humanities.

OpenMinTeD text mining tools, services and associated resources will run in the cloud, which requires in-depth optimization of service deployment and execution through scalable VM-based service distribution and distributed storage.

The project runs for 36 months, from June 2015 to May 2018.


Through its infrastructural foresight activities, OpenMinTeD’s vision is to make operational a virtuous cycle in which:

  • primary content is accessible through standardised programmatic interfaces and access rules,
  • by well-documented and easily discoverable text mining services and workflows, which process, analyse and annotate text to
  • identify patterns and extract new, meaningful, actionable knowledge, which will be used for
  • structuring, indexing and searching content and, in tandem,
  • act as a new knowledge resource useful for drawing new relations between content items and firing a new mining cycle.

UKP Lab leads WP 5 “Interoperability framework”, task 5.2 “Infrastructure interoperability specifications” and the use-case task 9.4 “Social Sciences”, and is further involved in WP 6 “Platform design” and WP 7 “Platform integration”.

Target groups

  • End users who will consume TM services
      • Researchers, database curators, …
          • Novice: use services to advance their science
          • Advanced: include TM services into more complex research workflows (SMEs)
  • Content and service providers that will provide their content and/or TM services for consumption
      • Publishers, libraries, scientific dbs, …
      • TM research communities
      • SMEs


  • Prof. Dr. Iryna Gurevych
  • Dr. Richard Eckart de Castilho
  • Masoud Kiaeeha