DKPro Core 1.7.0 Released

2014/11/28

We are pleased to announce the release of

DKPro Core, version 1.7.0 (ASL & GPL)

a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

code.google.com/p/dkpro-core-asl

code.google.com/p/dkpro-core-gpl

Analysis components

  • hunpos – wrapper for HunPos, an HMM POS tagger, including models for many languages;
  • langdetect – wrapper for language-detection, a language detection tool for Java;
  • mallet – wrapper for topic modelling using MALLET;
  • textnormalizer – original components for text normalization, e.g. spelling correction, umlaut normalization, expressive lengthening normalization.

Data formats

  • io.conll – support for CoNLL 2000, 2002, 2009 and 2012 formats;
  • io.ditop – support for DiTop topic model visualization format;
  • io.penntree – support for combined and chunked formats;
  • io.tueppdz – support for TüPP-D/Z format.

Further highlights in this release include:

A more detailed overview of the changes in this release can be found here.

When upgrading, please note that you should not mix different versions of DKPro Core components in the same project, as they may not be compatible with each other.
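
As a rough illustration of the upgrade advice, the following minimal sketch reports whether a list of dependency coordinates contains more than one DKPro Core version. The class `DkproVersionCheck`, the method `distinctVersions`, and the sample coordinates are hypothetical and only serve as an example; they are not part of DKPro Core, and in practice the coordinates would come from your build tool's dependency report.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DkproVersionCheck {

    // Hypothetical helper: collects the distinct versions among coordinates of the
    // form "groupId:artifactId:version" whose groupId starts with the given prefix.
    public static Set<String> distinctVersions(List<String> coordinates, String groupPrefix) {
        Set<String> versions = new TreeSet<>();
        for (String coordinate : coordinates) {
            String[] parts = coordinate.split(":");
            // Skip anything that is not a three-part coordinate or belongs to another group.
            if (parts.length == 3 && parts[0].startsWith(groupPrefix)) {
                versions.add(parts[2]);
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        // Illustrative coordinates only (assumed, not taken from the announcement).
        List<String> dependencies = Arrays.asList(
                "de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.hunpos-asl:1.7.0",
                "de.tudarmstadt.ukp.dkpro.core:de.tudarmstadt.ukp.dkpro.core.io.conll-asl:1.6.2",
                "org.example:unrelated-library:2.0");

        Set<String> versions = distinctVersions(dependencies, "de.tudarmstadt.ukp.dkpro.core");
        if (versions.size() > 1) {
            System.out.println("Mixed DKPro Core versions found: " + versions);
        } else {
            System.out.println("DKPro Core versions are consistent: " + versions);
        }
    }
}
```

If the check reports more than one version, align all DKPro Core dependencies on a single release before running your pipelines.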