UBY-LMF – A Comprehensive Instantiation Of ISO-LMF

All resources in UBY are represented according to a uniform and comprehensive LMF lexicon model, UBY-LMF. UBY-LMF captures lexical information at a fine-grained level by employing a large number of Data Categories from ISOcat. UBY-LMF, and thus also UBY, is designed to be directly extensible by new resources and languages.

Because its attributes and values are tied to ISOcat Data Categories, UBY-LMF provides structural and semantic interoperability (with respect to linguistic terminology) across resources, down to fine-grained semantic and syntactic information (see also the public Data Category Selection Uby 2012).

XML Serialization of UBY-LMF

The current DTD for the XML serialization of UBY-LMF can be found on GitHub.

The UBY-LMF DTD contains references to ISOcat Data Categories. Note that many attributes and attribute values in UBY-LMF link to an ISOcat Data Category under a different, so-called admitted, name. ISOcat supports this explicitly: each Data Category definition may optionally contain Data Element Name Sections that record other names for the Data Category as used in different sources, such as a given database, format or application.
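The idea of admitted names can be illustrated with a small lookup structure. The following is only a minimal sketch in Java and is not part of the UBY-LMF implementation: the class names `DataCategoryLookup` and `DataCategory` are hypothetical, and the identifier and names used in `main` are made-up placeholders rather than actual ISOcat entries or UBY-LMF attributes. It shows one Data Category being reachable both by its own identifier and by any other name recorded for it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: resolves admitted names to a single Data Category. */
public class DataCategoryLookup {

    /** A Data Category with its identifier and the other names recorded for it. */
    public static final class DataCategory {
        public final String identifier;
        public final List<String> admittedNames;

        public DataCategory(String identifier, List<String> admittedNames) {
            this.identifier = identifier;
            this.admittedNames = admittedNames;
        }
    }

    // Maps every known name (identifier or admitted name) to its Data Category.
    private final Map<String, DataCategory> byName = new HashMap<>();

    /** Indexes the category under its identifier and under each admitted name. */
    public void register(DataCategory dc) {
        byName.put(dc.identifier, dc);
        for (String name : dc.admittedNames) {
            byName.put(name, dc);
        }
    }

    /** Returns the Data Category registered under the given name, if any. */
    public Optional<DataCategory> resolve(String name) {
        return Optional.ofNullable(byName.get(name));
    }

    public static void main(String[] args) {
        DataCategoryLookup lookup = new DataCategoryLookup();
        // Placeholder values for illustration only, not real ISOcat data.
        lookup.register(new DataCategory("exampleCategory",
                Arrays.asList("exampleAttribute", "exAttr")));

        // An attribute name used in a local format resolves to the shared category.
        String resolved = lookup.resolve("exAttr")
                .map(dc -> dc.identifier)
                .orElse("unknown");
        System.out.println(resolved);
    }
}
```

In this sketch two resources that use different local names for the same concept both resolve to the same Data Category, which is the kind of terminological interoperability that admitted names are meant to support.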

Java Implementation of UBY-LMF

The primary implementation of the UBY-LMF lexicon model is in Java; see the current version on GitHub.