Extended Seminar - AI for Data Management

This seminar is about how AI can be used for data management. This year, the seminar focuses on learned DBMS components as well as AI for data management-related tasks such as data engineering, better data access interfaces, and supporting data beyond structured tabular data.

In this seminar, you will:

– learn the basics of AI for data management in a short series of hands-on lectures,

– implement a case study yourself,

– and present a recent research paper from a relevant venue such as SIGMOD, VLDB, ICML, or NeurIPS.

Organization

Last offered: Winter Semester (23/24)
Lecturer: Profs. Carsten Binnig, Dr. Manisha Luthra
Assistants: Johannes Wehrstein, M.Sc., Uelison Santos, M.Sc., Jan-Micha Bodensohn, M.Sc.
Examination: See Moodle
The kickoff meeting will be on the 17th of October, 2023, from 09:50 to 11:30 in room S207/167.

Course Information

Below you will find some general information about the seminar. For all information regarding this year’s seminar (including important dates), please check the Moodle course linked above. Also make sure that you are registered in TUCaN.

Prerequisites:

You should have basic knowledge of machine learning and of programming in Python and, ideally, C++. Advanced knowledge of data management and database systems from courses such as SDMS or ADMS is also helpful.

Seminar Topic:

Database management systems (DBMS) in the cloud are the backbone for managing large volumes of data efficiently and thus play a central role in business and science today. To provide high performance, many of the most complex DBMS components, such as query optimizers or schedulers, have to solve non-trivial problems.

To tackle such problems, very recent work has outlined a new direction of so-called learned DBMS components, where AI-based methods are used to replace and enhance core DBMS components. This approach has been shown to provide significant performance benefits. This route is particularly interesting since cloud vendors such as Google, Amazon, and Microsoft are already applying these techniques to optimize the performance of their cloud data systems.

Furthermore, AI has also been used to improve many other data-management-related tasks such as data engineering (e.g., error detection and correction in databases, or data transformation and data augmentation), which typically cause high manual overhead and can be automated with AI. Finally, AI has also been used to extend databases with better data access interfaces (e.g., natural language querying and chatbots for data) and with support for data beyond structured tabular data (i.e., text and images).