Learned Database Components
DeepDB is a data-driven approach to learned DBMS components that directly supports changes in the workload and data without the need for retraining. To achieve this, we learn deep probabilistic models over different parts of the database schema and show how to combine them efficiently.
We automate data partitioning in distributed DBMSs for OLAP-style workloads using Deep Reinforcement Learning (DRL). The main idea is that a DRL agent learns the cost trade-offs of different partitioning schemes and can thus automate the partitioning decision.
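To make the idea of learning cost trade-offs concrete, here is a minimal sketch and not the project's actual method. It replaces the deep network with a tabular, single-step epsilon-greedy learner that picks one complete partitioning scheme per step. The schema, table sizes, workload, and the `cost` function are all hypothetical stand-ins for measured query runtimes.

```python
import itertools
import random

# Hypothetical schema: candidate partitioning keys per table.
CANDIDATE_KEYS = {
    "orders": ["order_id", "customer_id"],
    "customer": ["customer_id", "nation_id"],
    "lineitem": ["order_id", "part_id"],
}
# Hypothetical table sizes (rows) and workload: (left, right, join key, frequency).
TABLE_ROWS = {"orders": 1_500, "customer": 150, "lineitem": 6_000}
WORKLOAD = [
    ("orders", "lineitem", "order_id", 10),
    ("orders", "customer", "customer_id", 3),
]


def cost(scheme):
    """Toy cost model (assumption): a join is free if both tables are
    partitioned on the join key; otherwise the smaller table is shipped."""
    total = 0
    for left, right, key, freq in WORKLOAD:
        if scheme[left] != key or scheme[right] != key:
            total += freq * min(TABLE_ROWS[left], TABLE_ROWS[right])
    return total


def learn_partitioning(steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy learner over complete partitioning schemes."""
    rng = random.Random(seed)
    tables = list(CANDIDATE_KEYS)
    actions = [
        dict(zip(tables, keys))
        for keys in itertools.product(*CANDIDATE_KEYS.values())
    ]
    # Rewards are negative costs, so 0.0 is an optimistic initial estimate
    # that makes the greedy step try every untested scheme first.
    q = [0.0] * len(actions)
    visits = [0] * len(actions)
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(actions))                   # explore
        else:
            a = max(range(len(actions)), key=q.__getitem__)   # exploit
        reward = -cost(actions[a])
        visits[a] += 1
        q[a] += (reward - q[a]) / visits[a]                   # running mean
    best = max(range(len(actions)), key=q.__getitem__)
    return actions[best]


if __name__ == "__main__":
    best = learn_partitioning()
    print(best, cost(best))
```

In this toy setting the learner ends up co-partitioning `orders` and `lineitem` on `order_id`, because the frequent, large join dominates the cost. A real agent would observe actual runtimes and generalize across workloads with a neural network rather than enumerate every scheme.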
In this project, we incorporate domain knowledge to obtain more reliable learned DBMS components that require less training data. While the high-level design of the DBMS component is still specified by code, we optimize it for a particular workload and hardware using differentiable programming.
It's AI Match: Schema Matching Using Embeddings
We propose a novel end-to-end approach for schema matching based on neural embeddings, in order to reduce the manual effort involved in data integration. The main idea is to use a two-step approach consisting of a table matching step followed by an attribute matching step. In both steps we use embeddings at different levels, representing either the whole table or single attributes.
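The two-step structure can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's implementation: the `embed` function is a toy character-trigram bag standing in for a neural embedding, the greedy one-to-one assignment is one possible matching strategy, and the example schemas are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): bag of character trigrams.
    A real system would use a learned neural encoder here."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) \
        * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def embed_table(name, attributes):
    """Table-level embedding: table name plus all attribute names."""
    vec = embed(name)
    for attr in attributes:
        vec = vec + embed(attr)
    return vec


def greedy_match(src, tgt):
    """Greedy one-to-one matching of two {name: vector} dicts,
    taking pairs in order of descending cosine similarity."""
    pairs = sorted(
        ((cosine(sv, tv), s, t) for s, sv in src.items() for t, tv in tgt.items()),
        reverse=True,
    )
    mapping, used = {}, set()
    for score, s, t in pairs:
        if score > 0 and s not in mapping and t not in used:
            mapping[s] = t
            used.add(t)
    return mapping


def match_schemas(source, target):
    """Step 1: match tables. Step 2: match attributes within matched tables."""
    table_map = greedy_match(
        {n: embed_table(n, attrs) for n, attrs in source.items()},
        {n: embed_table(n, attrs) for n, attrs in target.items()},
    )
    return {
        (s, t): greedy_match(
            {a: embed(a) for a in source[s]},
            {b: embed(b) for b in target[t]},
        )
        for s, t in table_map.items()
    }


if __name__ == "__main__":
    # Hypothetical schemas for illustration only.
    source = {"customer": ["customer_id", "name", "email"],
              "orders": ["order_id", "customer_id", "total"]}
    target = {"clients": ["client_id", "full_name", "email_address"],
              "purchase_orders": ["order_id", "client_id", "total_amount"]}
    for tables, attrs in match_schemas(source, target).items():
        print(tables, attrs)
```

Restricting attribute matching to already matched table pairs keeps the number of attribute comparisons small, which is the main benefit of doing table matching first.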