Best Student Paper Award for our paper "XAI - A Middleware for Scalable AI" which was accepted to Data 2019

2019/07/01

Abdallah Salama received the Best Student Paper Award for his paper “XAI – A Middleware for Scalable AI”. XAI runs on top of existing deep learning frameworks such as TensorFlow or MXNet and automates the hyper-parameter search for distributed deep learning.

A major obstacle to the adoption of deep neural networks (DNNs) is that training can take multiple hours or days even with modern GPUs. To speed up training of modern DNNs, recent deep learning frameworks support distributing the training process across multiple machines in a cluster. However, even when well-established models such as AlexNet or GoogLeNet are used, it remains challenging for data scientists to scale out distributed deep learning in their own environments and on their own hardware resources.

In this paper, we present XAI, a middleware on top of existing deep learning frameworks such as MXNet and TensorFlow that makes it easy to scale out distributed training of DNNs. XAI aims to give data scientists a simple interface for specifying the model to be trained and the available resources (e.g., number of machines, number of GPUs per machine). At the core of XAI, we have implemented a distributed optimizer that takes the model and the available cluster resources as input and finds a distributed training setup for the given model that best leverages those resources. Our experiments show that XAI converges to a desired training accuracy 2x to 5x faster than the default distribution setups in MXNet and TensorFlow.
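
To make the idea of searching for a distributed setup more concrete, here is a minimal illustrative sketch in Python. It is not XAI's interface or algorithm: the names `Cluster`, `Setup`, `candidate_setups`, `toy_epoch_time`, and `best_setup` are hypothetical, the split of machines into workers and parameter servers is an assumed search space, and the cost model is invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Available resources, as a data scientist might describe them (hypothetical)."""
    machines: int
    gpus_per_machine: int


@dataclass(frozen=True)
class Setup:
    """One candidate distributed setup (assumed search space, not XAI's)."""
    workers: int
    parameter_servers: int


def candidate_setups(cluster):
    """Enumerate every split of the machines into workers and parameter servers."""
    for workers in range(1, cluster.machines):
        yield Setup(workers=workers, parameter_servers=cluster.machines - workers)


def toy_epoch_time(setup, cluster, compute_cost=100.0, comm_cost=10.0):
    """Made-up cost model: compute time shrinks with the number of GPUs, while
    communication time grows with the worker-to-server ratio."""
    gpus = setup.workers * cluster.gpus_per_machine
    return compute_cost / gpus + comm_cost * setup.workers / setup.parameter_servers


def best_setup(cluster):
    """Pick the candidate with the lowest estimated epoch time (exhaustive search)."""
    return min(candidate_setups(cluster), key=lambda s: toy_epoch_time(s, cluster))


if __name__ == "__main__":
    cluster = Cluster(machines=4, gpus_per_machine=2)
    print(best_setup(cluster))
```

A real optimizer would replace the toy cost model with measurements or a calibrated performance model, and would likely search over more dimensions (for example, batch size or synchronization mode) than this two-way split.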
