Algorithm for Processing Queries that Involve Boolean Columns for a Natural Language Interface to Databases

Rodolfo A. Pazos R., José A. Martínez F., J. Javier Gonzalez B., Andrés A. Verástegui O.


In recent decades, the use of natural language interfaces to databases (NLIDBs) has increased exponentially; unfortunately, the complexity of natural language has limited their effectiveness. The presence of Boolean columns in databases increases the difficulty of translating natural language queries to SQL. A Boolean column is a column that can store only one of two possible values, for example true/false, yes/no, or 1/0. The difficulty in processing queries that involve Boolean columns is that the search value for these columns (true/false, yes/no, 1/0) is not stated explicitly in the queries. As a result, NLIDBs generate erroneous translations, as shown in experimental tests. A survey of the literature on NLIDBs has shown that this problem has not been identified, much less addressed. In this article, a new algorithm for processing queries that involve Boolean columns is presented. The algorithm uses syntactic and semantic information that facilitates detecting Boolean columns and their implicit values in a query. The experimental tests show that it is highly effective for translating this type of query.
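To make the implicit-value problem concrete, the following is a minimal sketch in Python using the standard `sqlite3` module. It does not implement the algorithm presented in the article; it only illustrates the gap the abstract describes. The table `flight`, the Boolean column `delayed`, the sample rows, and the question "Which flights are delayed?" are all hypothetical and do not come from the article. The question never mentions the value 1, yet a correct SQL translation has to compare the Boolean column against it.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical in-memory table with one Boolean (1/0) column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flight (flight_no TEXT, delayed INTEGER)")
    conn.executemany(
        "INSERT INTO flight VALUES (?, ?)",
        [("AA10", 1), ("BB20", 0), ("CC30", 1)],
    )
    return conn


def naive_translation():
    """Translation that recognizes the table but misses the implicit value.

    Assumed behavior of a translator that finds no explicit search value
    in "Which flights are delayed?" and therefore emits no condition.
    """
    return "SELECT flight_no FROM flight"


def intended_translation():
    """Translation that makes the implicit Boolean value explicit."""
    return "SELECT flight_no FROM flight WHERE delayed = 1"


def run(conn, sql):
    """Execute a query and return the first column, sorted for stable output."""
    return sorted(row[0] for row in conn.execute(sql))


if __name__ == "__main__":
    conn = build_demo_db()
    print("naive:   ", run(conn, naive_translation()))
    print("intended:", run(conn, intended_translation()))
```

The naive query returns every flight, while the intended query returns only the rows whose Boolean column is set, which is the difference between an erroneous and a correct translation in this toy setting.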


Natural language interfaces to databases, natural language processing, databases, SQL
