DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. The provided components wrap a constantly growing set of state-of-the-art NLP tools and also include several original components. Together they cover a wide range of tasks, including tokenization/segmentation, compound splitting, stemming, part-of-speech tagging, lemmatization, constituency parsing, dependency parsing, named entity recognition, coreference resolution, language identification, spelling correction, and grammar checking, as well as reading and writing various file and corpus formats.


DKPro Core relies heavily on uimaFIT and is meant to be used with Apache Maven. The main components are hosted on Maven Central, while distributable models are available from the public Maven repository at UKP Lab.

Documentation, source code, and further instructions regarding DKPro Core can be found on the GitHub project site.

The source code is provided under different licenses, depending on the DKPro Core component:

  • DKPro Core ASL components use the Apache License 2.0
  • DKPro Core GPL components use the GNU General Public License 3.0