ReFSQL: A Retrieval-Augmentation Framework for Text-to-SQL Generation

Authors
Zhang, Kun [1 ,2 ]
Lin, Xiexiong [3 ]
Wang, Yuanzhuo [1 ,2 ,4 ]
Zhang, Xin [3 ]
Sun, Fei [1 ,2 ]
Cen, Jianhe [4 ]
Jiang, Xuhui [1 ,2 ]
Tan, Hexiang [1 ,2 ]
Shen, Huawei [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Data Intelligence Syst Res Ctr, Beijing 100864, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing, Peoples R China
[3] Ant Grp, Hangzhou, Peoples R China
[4] Big Data Acad, Barcelona, Spain
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text-to-SQL is the task of translating natural language questions into SQL queries. Existing methods directly align natural language with SQL and train a single encoder-decoder model to fit all questions. However, they underestimate the inherent structural characteristics of SQL, as well as the gap between specific structure knowledge and general knowledge, which leads to structural errors in the generated SQL. To address these challenges, we propose a retrieval-augmentation framework, namely ReFSQL. It contains two parts: a structure-enhanced retriever and a generator. The structure-enhanced retriever is designed to identify samples with comparable specific knowledge in an unsupervised way. We then incorporate the SQL of the retrieved samples into the input, enabling the model to acquire prior knowledge of similar SQL grammar. To further bridge the gap between specific and general knowledge, we present a Mahalanobis contrastive learning method, which facilitates the transfer of the sample toward the specific knowledge distribution constructed by the retrieved samples. Experimental results on five datasets verify the effectiveness of our approach in improving the accuracy and robustness of Text-to-SQL generation. Our framework improves performance when combined with many other backbone models (including the 11B Flan-T5) and achieves state-of-the-art performance among existing methods that employ fine-tuning.
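The retrieve-then-augment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's structure-enhanced retriever is replaced here by a plain token-overlap (Jaccard) similarity, and the example pairs, the function names (`retrieve`, `build_input`), and the input template are all hypothetical, chosen only to show how the SQL of retrieved samples might be placed into the generator input.

```python
from typing import List, Tuple

# Hypothetical toy (question, SQL) pairs; not taken from the paper or its datasets.
EXAMPLES: List[Tuple[str, str]] = [
    ("How many singers are there?", "SELECT count(*) FROM singer"),
    ("List the names of all students older than 20.",
     "SELECT name FROM student WHERE age > 20"),
    ("What is the average age of singers?", "SELECT avg(age) FROM singer"),
]


def tokens(text: str) -> set:
    """Lowercase the text, drop '?' and '.', and split on whitespace."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for the paper's retriever (assumption)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(question: str, examples: List[Tuple[str, str]],
             k: int = 2) -> List[Tuple[str, str]]:
    """Return the k examples whose questions are most similar to the input."""
    ranked = sorted(examples, key=lambda ex: jaccard(question, ex[0]), reverse=True)
    return ranked[:k]


def build_input(question: str, retrieved: List[Tuple[str, str]]) -> str:
    """Append the SQL of the retrieved samples to the question (hypothetical template)."""
    hints = " ; ".join(sql for _, sql in retrieved)
    return f"question: {question} | similar SQL: {hints}"


if __name__ == "__main__":
    q = "How many students are there?"
    print(build_input(q, retrieve(q, EXAMPLES, k=1)))
```

The string returned by `build_input` would then be passed to an encoder-decoder generator; the Mahalanobis contrastive learning objective mentioned in the abstract is not covered by this sketch.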
Pages: 664-673
Page count: 10