Our paper “DBPal: Weak Supervision for Learning a Natural Language Interface to Databases” was accepted for the “Conversational Access to Data (CAST) Workshop” at VLDB 2019

A new system to translate natural language utterances into SQL statements using a neural machine translation model

2019/07/01

This paper describes DBPal, a new system to translate natural language utterances into SQL statements using a neural machine translation model.

Other recent approaches also use neural machine translation to implement a Natural Language Interface to Databases (NLIDB), but they rely on supervised learning with manually curated training data, which results in a high overhead for supporting each new database schema. To avoid this issue, DBPal implements a novel training pipeline based on weak supervision that synthesizes all training data from a given database schema.
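
To make the idea of synthesizing training data from a schema more concrete, here is a minimal sketch in Python. It is not DBPal's actual pipeline: the toy schema, the two seed templates, the `@VALUE` constant placeholder, and the function name `synthesize_pairs` are all assumptions made for illustration. The sketch only shows how natural language/SQL pairs could be generated by filling template slots with table and column names.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy schema (not from the paper): table name -> column names.
SCHEMA: Dict[str, List[str]] = {
    "patients": ["name", "age"],
    "doctors": ["name", "specialty"],
}

# Hypothetical seed templates: each pairs a natural language pattern with
# the SQL pattern it should translate to. "@VALUE" stands in for a constant.
TEMPLATES: List[Tuple[str, str]] = [
    ("show the {column} of all {table}",
     "SELECT {column} FROM {table}"),
    ("what is the {column} of {table} where {filter} is @VALUE",
     "SELECT {column} FROM {table} WHERE {filter} = @VALUE"),
]


def synthesize_pairs(
    schema: Dict[str, List[str]],
    templates: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Fill every template with every valid slot combination from the schema."""
    pairs: List[Tuple[str, str]] = []
    for table, columns in schema.items():
        for nl_tpl, sql_tpl in templates:
            needs_filter = "{filter}" in sql_tpl
            for column in columns:
                # Only templates with a WHERE clause need a filter column;
                # use a different column than the one being selected.
                filters: List[Optional[str]] = (
                    [c for c in columns if c != column] if needs_filter else [None]
                )
                for flt in filters:
                    slots = {"table": table, "column": column, "filter": flt}
                    pairs.append((nl_tpl.format(**slots), sql_tpl.format(**slots)))
    return pairs


if __name__ == "__main__":
    for question, sql in synthesize_pairs(SCHEMA, TEMPLATES):
        print(f"{question!r} -> {sql!r}")
```

Because the pairs are derived only from the schema and the templates, no manually labeled examples are needed for a new database. A real pipeline would need far more templates and linguistic variation than this sketch provides.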

In our evaluation, we show that DBPal can outperform existing rule-based NLIDBs while achieving comparable performance to other NLIDBs that leverage deep neural network models but rely on manually curated training data for each database schema.