GTR: An SQL Generator With Transition Representation in Cross-Domain Database Systems

Cited by: 0
Authors
Qiao, Shaojie [1]
Liu, Chenxu [1]
Yang, Guoping [1]
Han, Nan [2]
Peng, Yuhan [1]
Wu, Lingchun [1]
Li, He [3]
Yuan, Guan [4]
Affiliations
[1] Chengdu University of Information Technology, School of Software Engineering, Chengdu 610225, People's Republic of China
[2] Chengdu University of Information Technology, School of Management, Chengdu 610225, People's Republic of China
[3] Xidian University, School of Computer Science and Technology, Xi'an 710071, People's Republic of China
[4] China University of Mining and Technology, School of Computer Science and Technology, Xuzhou 221116, Jiangsu, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Automatic SQL generator; cross-domain database; grammar-based neural model; natural language (NL); NL-to-SQL learning system; transition representation (TR); TEXT-TO-SQL; NATURAL-LANGUAGE
DOI
10.1109/TNNLS.2023.3309824
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent studies have focused on using natural language (NL) to automatically retrieve useful data from database (DB) systems. As an important component of autonomous DB systems, the NL-to-SQL technique can help DB administrators write high-quality SQL statements and help people with no SQL background learn complex SQL. However, existing approaches do not address two problems: NL expressions inevitably mismatch the implementation details of SQL, and the large number of out-of-domain (OOD) words makes it difficult to predict table columns. In particular, it is difficult to convert NL into SQL accurately in an end-to-end fashion. Intuitively, a "bridge" that is compatible with both NL and SQL, namely a transition representation (TR), helps the model understand the relations between the two during conversion. In this article, we propose GTR, an automatic SQL generator with TR for cross-domain DB systems. Specifically, GTR generates SQL in three steps: 1) it learns the relation between questions and DB schemas; 2) it uses a grammar-based model to synthesize a TR; and 3) it predicts SQL from the TR based on rules. We conduct extensive experiments on two commonly used datasets, WikiSQL and Spider. On the test sets of Spider and WikiSQL, GTR achieves 58.32% and 71.29% exact matching accuracy, respectively, outperforming state-of-the-art methods.
Pages: 17908-17920
Number of pages: 13