DBPal
Natural language has long been a promising alternative query interface to databases, one that enables non-expert users to formulate complex questions concisely. Recently, deep learning techniques have gained traction as a way to translate natural language to SQL, since similar ideas have been successful in related domains (e.g., translating English to Spanish). However, the core problem with existing deep learning approaches is that they require an enormous number of training examples in order to provide accurate translations. Such training data is extremely expensive to curate, since it generally requires humans to manually annotate natural language questions with SQL queries.
Based on these observations, we propose DBPal, a new approach that augments existing deep learning techniques in order to improve the performance of natural language to SQL translation. More specifically, we present a novel training pipeline that automatically generates synthetic training data in order to improve translation accuracy and to create a model that is tailor-made for the target database. As we show, applying our training pipeline to existing deep learning techniques improves the accuracy of state-of-the-art natural language to SQL translation.
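To illustrate what automatically generating synthetic training data for a target database can look like, here is a minimal sketch in Python. It assumes a simple template-instantiation scheme. The schema, the seed templates, and the function `generate_pairs` are hypothetical and chosen only for illustration; they are not taken from the DBPal implementation.

```python
# Hypothetical schema of the target database: table name -> column names.
SCHEMA = {
    "patients": ["name", "age", "diagnosis"],
    "doctors": ["name", "specialty"],
}

# Hypothetical seed templates: a natural language pattern paired with a SQL pattern.
TEMPLATES = [
    ("show the {column} of all {table}", "SELECT {column} FROM {table}"),
    ("how many {table} are there", "SELECT COUNT(*) FROM {table}"),
]


def generate_pairs(schema, templates):
    """Fill every template with the tables and columns of the given schema."""
    pairs = []
    for table, columns in schema.items():
        for nl_pattern, sql_pattern in templates:
            if "{column}" in nl_pattern:
                # Column-level template: one training pair per column.
                for column in columns:
                    pairs.append((
                        nl_pattern.format(table=table, column=column),
                        sql_pattern.format(table=table, column=column),
                    ))
            else:
                # Table-level template: one training pair per table.
                pairs.append((
                    nl_pattern.format(table=table),
                    sql_pattern.format(table=table),
                ))
    return pairs


pairs = generate_pairs(SCHEMA, TEMPLATES)
for question, query in pairs:
    print(f"{question!r} -> {query!r}")
```

Because the pairs are derived from the schema itself, the resulting training set is specific to the target database and needs no manual annotation. A realistic pipeline would likely also have to vary the wording of the questions, which this sketch leaves out.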
Further Resources
Researchers
Name | Office | Phone | Email |
---|---|---|---|
Nadja Geisler M.Sc. | S2\|02 D110 | +49 6151 16-25603 | nadja.geisler@cs.tu-... |
Benjamin Hättasch M.Sc. Doctoral Researcher | S2\|02 D110 | +49 6151 16-25603 | benjamin.haettasch@cs.tu-... |
Publications
Weir, Nathaniel ; Crotty, Andrew ; Galakatos, Alex ; Ilkhechi, Amir ; Ramaswamy, Shekar ; Bhushan, Rohin ; Cetintemel, Ugur ; Utama, Prasetya ; Geisler, Nadja ; Hättasch, Benjamin ; Eger, Steffen ; Binnig, Carsten (2019):
DBPal: Weak Supervision for Learning a Natural Language Interface to Databases.
1st International Workshop on Conversational Access to Data (CAST), in conjunction with the 45th International Conference on Very Large Data Bases (VLDB), Los Angeles, California, USA, [Conference publication]
Basik, Fuat ; Hättasch, Benjamin ; Ilkhechi, Amir ; Usta, Arif ; Ramaswamy, Shekar ; Utama, Prasetya ; Weir, Nathaniel ; Binnig, Carsten ; Cetintemel, Ugur (2018):
DBPal: A Learned NL-Interface for Databases.
In: SIGMOD '18, pp. 1765-1768, Proceedings of the 2018 International Conference on Management of Data, New York, NY, USA, ACM, ISBN 978-1-4503-4703-7,
DOI: 10.1145/3183713.3193562,
[Conference publication]
Utama, Prasetya ; Weir, Nathaniel ; Basik, Fuat ; Binnig, Carsten ; Cetintemel, Ugur ; Hättasch, Benjamin ; Ilkhechi, Amir ; Ramaswamy, Shekar ; Usta, Arif (2018):
An End-to-end Neural Natural Language Interface for Databases.
In: arXiv preprint arXiv:1804.00401, [Article]