Self-attention Based Text Matching Model with Generative Pre-training

Cited by: 1
Authors
Zhang, Xiaolin [1 ]
Lei, Fengpei [1 ]
Yu, Shengji [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Keywords
deep learning; text matching; variational autoencoder; depth-wise separable convolutions; self-attention;
DOI
10.1109/DASC-PICom-CBDCom-CyberSciTech52372.2021.00027
CLC classification number
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Text matching is an important method for judging the semantic similarity of different sentences, and improving its efficiency and accuracy is a central focus in the field of information matching. In recent years, deep learning has been widely applied to text matching tasks and has achieved good results. However, different models have different limitations: CNNs cannot learn global semantic information well, RNNs cannot be parallelized well, and large pre-trained language models have too many parameters to be deployed easily on hardware. To address these problems, this paper proposes a self-attention based text matching model with generative pre-training. A self-attention mechanism is adopted to learn the semantic relationships between words in a sentence and allows better parallelization, while depth-wise separable convolutions are used to extract local features. In the pre-training stage, a generative model, the variational autoencoder, is used to learn the semantic relationship between similar sentences. In the downstream text matching model, we employ a Siamese network structure, combine depth-wise separable convolutions with the self-attention mechanism for feature extraction, and use an attention mechanism for text interaction; the parameters from the pre-training phase are shared. Finally, we evaluate our model on three datasets: LCQMC, QQP, and a securities dataset. Experimental results show that our method achieves good performance.
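To make the described architecture concrete, the sketch below is a minimal, hypothetical illustration (not the authors' implementation) of a Siamese encoder that combines a depth-wise separable convolution for local features with self-attention for global context, written in PyTorch; all module names, dimensions, and the mean-pooling/feature-concatenation choices are assumptions, and the VAE pre-training stage is omitted.

```python
# Minimal sketch (assumed PyTorch; illustrative names and dimensions only).
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """1D depth-wise separable convolution over a sequence of embeddings."""

    def __init__(self, dim: int, kernel_size: int = 5):
        super().__init__()
        # Depth-wise: one filter per channel (groups == channels).
        self.depthwise = nn.Conv1d(dim, dim, kernel_size,
                                   padding=kernel_size // 2, groups=dim)
        # Point-wise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv1d(dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); Conv1d expects (batch, dim, seq_len).
        y = self.pointwise(self.depthwise(x.transpose(1, 2)))
        return y.transpose(1, 2)


class EncoderBlock(nn.Module):
    """Local features via separable convolution, global context via self-attention."""

    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.conv = DepthwiseSeparableConv(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.norm1(x + self.conv(x))       # local n-gram features
        attn_out, _ = self.attn(x, x, x)       # global token interactions
        return self.norm2(x + attn_out)


class SiameseMatcher(nn.Module):
    """Siamese structure: both sentences share the same encoder weights."""

    def __init__(self, vocab_size: int = 30000, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = EncoderBlock(dim)
        self.classifier = nn.Linear(4 * dim, 2)   # match / no-match logits

    def encode(self, ids: torch.Tensor) -> torch.Tensor:
        # Mean-pooled sentence vector (pooling strategy is an assumption).
        return self.encoder(self.embed(ids)).mean(dim=1)

    def forward(self, ids_a: torch.Tensor, ids_b: torch.Tensor) -> torch.Tensor:
        a, b = self.encode(ids_a), self.encode(ids_b)
        features = torch.cat([a, b, torch.abs(a - b), a * b], dim=-1)
        return self.classifier(features)


if __name__ == "__main__":
    model = SiameseMatcher()
    s1 = torch.randint(0, 30000, (2, 16))   # two toy sentence pairs, length 16
    s2 = torch.randint(0, 30000, (2, 16))
    print(model(s1, s2).shape)               # torch.Size([2, 2])
```

In such a setup, the shared encoder weights (and, per the abstract, parameters learned during VAE pre-training) would be reused for both sentences, with the concatenated pair features fed to a small classifier.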
Pages: 84 - 91
Number of pages: 8
Related papers
50 records in total
  • [1] Multilingual Constituency Parsing with Self-Attention and Pre-Training
    Kitaev, Nikita
    Cao, Steven
    Klein, Dan
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3499 - 3505
  • [2] A neighborhood-aware graph self-attention mechanism-based pre-training model for Knowledge Graph Reasoning
    Wu, Yuejia
    Zhou, Jian-tao
    [J]. INFORMATION SCIENCES, 2023, 647
  • [3] MolXPT: Wrapping Molecules with Text for Generative Pre-training
    Liu, Zequn
    Zhang, Wei
    Xia, Yingce
    Wu, Lijun
    Xie, Shufang
    Qin, Tao
    Zhang, Ming
    Liu, Tie-Yan
    [J]. 61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1606 - 1616
  • [4] A Text Sentiment Analysis Model Based on Self-Attention Mechanism
    Ji, Likun
    Gong, Ping
    Yao, Zhuyu
    [J]. 2019 THE 3RD INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPILATION, COMPUTING AND COMMUNICATIONS (HP3C 2019), 2019, : 33 - 37
  • [5] A Self-attention Based Model for Offline Handwritten Text Recognition
    Nam Tuan Ly
    Trung Tan Ngo
    Nakagawa, Masaki
    [J]. PATTERN RECOGNITION, ACPR 2021, PT II, 2022, 13189 : 356 - 369
  • [6] Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
    Xue, Hongwei
    Huang, Yupan
    Liu, Bei
    Peng, Houwen
    Fu, Jianlong
    Li, Houqiang
    Luo, Jiebo
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training
    Zhang, Yizhe
    Wang, Guoyin
    Li, Chunyuan
    Gan, Zhe
    Brockett, Chris
    Dolan, Bill
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 8649 - 8670
  • [8] Modal parameter estimation of turbulence response based on self-attention generative model
    Duan, Shiqiang
    Zheng, Hua
    Yu, Jinge
    Wu, Yafeng
    [J]. JOURNAL OF VIBRATION AND CONTROL, 2023,
  • [9] Self-attention based Text Knowledge Mining for Text Detection
    Wan, Qi
    Ji, Haoqin
    Shen, Linlin
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5979 - 5988
  • [10] Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering
    Gong, Haifan
    Chen, Guanqi
    Liu, Sishuo
    Yu, Yizhou
    Li, Guanbin
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 456 - 460