Tree-based algorithms for extreme multi-label classification

12.06.2018 10:00-11:00


Prof. Krzysztof Dembczyński, Ph.D., Poznan University of Technology, Poland

12.06.2018, 10:00 – 11:00 | TU Darmstadt, Building S1|01, Karolinenplatz 5, Room A 02

Organizer: Department of Computer Science, Knowledge Engineering Group


Extreme multi-label classification (XMLC) refers to learning problems in which each example is tagged with a few relevant labels drawn from a very large set, potentially containing millions of labels. It has recently been shown that, apart from automatic tagging, this framework can be used to address problems in ranking, recommendation systems, and web advertising. In this talk, we will first discuss potential applications of XMLC and the challenges it faces. We will then present a theoretical framework for analyzing XMLC algorithms. Finally, we will focus on tree-based algorithms, which seem to be the right tool for XMLC problems because they can be very efficient in time and space while remaining competitive in predictive performance.

We will distinguish two groups of algorithms: classical decision trees and label trees. We will briefly describe FastXML, which represents the first group. We will then contrast this approach with probabilistic label trees (PLTs), in which each label corresponds to exactly one path from the root to a leaf node. As the main result, we will show that PLTs are a no-regret multi-label generalization of the popular hierarchical softmax (HSM), which is used in deep networks to deal with large-scale multi-class problems.
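To make the "one label, one path" idea concrete, here is a minimal sketch of how a label's probability could be read off a probabilistic label tree: the estimate for a label is taken as the product of the conditional estimates stored at the nodes on its root-to-leaf path. The tree shape, the node names, and all probability values below are hypothetical and chosen only for illustration; in a real PLT each node estimate would come from a classifier evaluated on the example.

```python
from math import prod

# Hypothetical toy tree over four labels: each node maps to its parent
# (the root has no parent). Leaves correspond one-to-one to labels.
PARENT = {
    "root": None,
    "left": "root", "right": "root",
    "label_a": "left", "label_b": "left",
    "label_c": "right", "label_d": "right",
}

# Made-up conditional estimates for a single example x:
# P(subtree of node contains a relevant label | same holds for its parent, x).
NODE_PROB = {
    "root": 1.0,
    "left": 0.9, "right": 0.4,
    "label_a": 0.8, "label_b": 0.5,
    "label_c": 0.5, "label_d": 0.1,
}


def path_to_root(node):
    """Return the nodes on the unique path from `node` up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def label_probability(leaf):
    """Multiply the conditional estimates along the leaf's path."""
    return prod(NODE_PROB[v] for v in path_to_root(leaf))


if __name__ == "__main__":
    for label in ("label_a", "label_b", "label_c", "label_d"):
        print(label, round(label_probability(label), 4))
```

In this toy tree the estimates of the two children of the root sum to more than one (0.9 + 0.4), which is allowed in the multi-label setting because several labels can be relevant at once. In hierarchical softmax for multi-class problems, the child probabilities at each node instead sum to one.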


Krzysztof Dembczyński is an assistant professor at Poznan University of Technology, Poland. He received his B.Sc., M.Sc., and Ph.D. degrees in computer science from the same university. He spent two years, from 2009 to 2011, as a post-doctoral researcher at Marburg University, Germany. From 2011 to 2013, he received a stipend for outstanding young scientists funded by the Polish Ministry of Science and Higher Education. He was also a laureate of a prestigious scholarship from the Foundation for Polish Science from 2012 to 2014. He received his habilitation degree in 2018. His research interests span the field of machine learning, in particular multi-label classification, preference learning, and extreme classification. His articles have been published at the main conferences (ECML, ICML, NIPS) and in the leading journals (JMLR, MLJ, DAMI) in the field of machine learning. As a co-author, he won the best paper award at ECAI in 2012 and at ACML in 2015. In 2013, he gave two tutorials on multi-target prediction problems at ICML and ALT/DS. Recently, he has been involved in the organization of several extreme classification events: NIPS 2017 Workshop, ECIR 2018 Tutorial, WWW 2018 Workshop, and the upcoming Dagstuhl Seminar. He serves as a program committee member at the major conferences in the field of artificial intelligence (ECML, ICML, NIPS, ECAI, AAAI, IJCAI) and as a reviewer for several international journals.
