A Heterogeneous Graph to Abstract Syntax Tree Framework for Text-to-SQL

Cited by: 2
Authors
Cao, Ruisheng [1]
Chen, Lu [1]
Li, Jieyu [1]
Zhang, Hanchong [1]
Xu, Hongshen [1]
Zhang, Wangyou [1]
Yu, Kai [1]
Affiliations
[1] Shanghai Jiao Tong Univ, X-LANCE Lab, MoE Key Lab Artificial Intelligence, Dept Comp Sci & Engn, AI Inst, Shanghai 200240, Peoples R China
Keywords
Structured Query Language; Decoding; Databases; Syntactics; Semantics; Task analysis; Computational modeling; Abstract syntax tree; grammar-based constrained decoding; heterogeneous graph neural network; knowledge-driven natural language processing; permutation invariant problem; text-to-SQL
DOI
10.1109/TPAMI.2023.3298895
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text-to-SQL is the task of converting a natural language utterance plus the corresponding database schema into a SQL program. The inputs naturally form a heterogeneous graph, while the output SQL can be transduced into an abstract syntax tree (AST). Traditional encoder-decoder models ignore higher-order semantics in heterogeneous graph encoding and introduce permutation biases during AST construction, and are thus incapable of exploiting the refined structure knowledge precisely. In this work, we propose a generic heterogeneous graph to abstract syntax tree (HG2AST) framework to integrate dedicated structure knowledge into statistics-based models. On the encoder side, we leverage a line graph enhanced encoder (LGESQL) to iteratively update both node and edge features through dual graph message passing and aggregation. On the decoder side, a grammar-based decoder first constructs the equivalent SQL AST and then transforms it into the desired SQL via post-processing. To avoid over-fitting permutation biases, we propose a golden tree-oriented learning (GTL) algorithm to adaptively control the expanding order of AST nodes. The graph encoder and tree decoder are combined into a unified framework through two auxiliary modules. Extensive experiments on various text-to-SQL datasets, including single/multi-table, single/cross-domain, and multilingual settings, demonstrate the superiority and broad applicability of the framework.
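The "line graph" idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `build_line_graph`, the toy node labels, the relation names, and the choice to skip immediate back-tracking pairs are all assumptions made for illustration. The sketch only shows the general construction in which every edge of the original graph becomes a node of the line graph, so that edge features can be updated by message passing just like node features.

```python
from collections import defaultdict


def build_line_graph(edges):
    """Build a directed line graph from (source, relation, target) triples.

    Each original edge becomes a line-graph node, identified by its index.
    Line-graph edge (i, j) exists when edge i ends where edge j starts.
    Assumption for this sketch: pairs where j simply returns to the
    source of i (immediate back-tracking) are skipped.
    """
    by_source = defaultdict(list)
    for j, (src, _rel, _tgt) in enumerate(edges):
        by_source[src].append(j)

    line_edges = []
    for i, (src_i, _rel, tgt_i) in enumerate(edges):
        for j in by_source.get(tgt_i, []):
            if edges[j][2] == src_i:
                continue  # skip i -> j when j leads straight back
            line_edges.append((i, j))
    return line_edges


# Hypothetical toy graph: one question node, one table, two columns.
edges = [
    ("q:singers", "question-table-match", "t:singer"),
    ("t:singer", "table-has-column", "c:singer.name"),
    ("t:singer", "table-has-column", "c:singer.age"),
    ("c:singer.name", "column-belongs-to", "t:singer"),
]
line_edges = build_line_graph(edges)
print(line_edges)
```

In this toy graph, the line-graph edge from edge 0 to edge 1 connects the question-to-table relation with a table-to-column relation, which is one way a two-hop pattern could be exposed to an encoder.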
Pages: 13796 - 13813
Number of pages: 18