OpenMinTeD – New EU project for text & data mining infrastructure



Recent years have witnessed an upsurge in the quantity of digital research data, offering new insights and opportunities for improved understanding. Text and data mining is emerging as a powerful tool for harnessing structured and unstructured content and data, analysing them at multiple levels and in several dimensions to discover hidden and new knowledge. Text mining solutions, however, are not easy to discover and use, nor can end users easily combine them.

OpenMinTeD aspires to enable the creation of an infrastructure that fosters and facilitates the discovery and use of text mining technologies and interoperable services. It examines several use cases identified by experts from different scientific areas, ranging from generic scholarly communication to literature related to life sciences, food and agriculture, and social sciences and humanities.

OpenMinTeD text mining tools, services and associated resources will run on the cloud, requiring an in-depth optimization of service deployment and execution via scalable VM-based service distribution and use of distributed storage.

The project runs for 36 months, from June 2015 to May 2018.


Through its infrastructural foresight activities, OpenMinTeD’s vision is to make operational a virtuous cycle in which:

  • primary content is accessible through standardised programmatic interfaces and access rules,
  • this content is consumed by well-documented and easily discoverable text mining services and workflows, which process, analyse and annotate text to
  • identify patterns and extract new, meaningful, actionable knowledge, which is then used for
  • structuring, indexing and searching content, and, in tandem,
  • acts as a new knowledge resource useful for drawing new relations between content items and triggering a new mining cycle.

UKP Lab leads WP 5 “Interoperability framework”, task 5.2 “Infrastructure interoperability specifications” and the use-case task 9.4 “Social Sciences”, and is further involved in WP 6 “Platform design” and WP 7 “Platform integration”.

Target groups

  • End users who will consume TM services
      • Researchers, database curators, …
      • Novice: use services to advance their science
      • Advanced: include TM services into more complex research workflows (SMEs)
  • Content and service providers that will provide their content and/or TM services for consumption
      • Publishers, libraries, scientific databases, …
      • TM research communities
      • SMEs


Funded by the EC under the H2020 Framework Programme for Research and Innovation.

Grant Agreement No. 654021, H2020-EINFRA-2014-2.