Deep bi-directional interaction network for sentence matching

Cited by: 10
Authors
Liu, Mingtong [1]
Zhang, Yujie [1]
Xu, Jinan [1]
Chen, Yufeng [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Traff Data Anal & Min, Sch Comp & Informat Technol, Beijing, Peoples R China
Keywords
Sentence matching; Deep interaction network; Deep fusion; Attention mechanism; Multi-layer neural network; Interpretability study;
DOI
10.1007/s10489-020-02156-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The goal of sentence matching is to determine the semantic relation between two sentences, which is the basis of many downstream tasks in natural language processing, such as question answering and information retrieval. Recent studies that use attention mechanisms to align the elements of two sentences have shown promising results in capturing semantic similarity and relevance. Most existing methods focus on the design of multi-layer attention networks; however, some critical issues have not been handled well: 1) a higher attention layer is easily affected by error propagation because it relies on the alignment results of the preceding attention layers; 2) models risk losing low-layer semantic features as network depth increases; and 3) capturing global matching information incurs high computational complexity during model training. To address these issues, we propose a Deep Bi-Directional Interaction Network (DBDIN), which captures semantic relatedness from two directions, with each direction employing multiple attention-based interaction units. Specifically, the attention of each interaction unit repeatedly focuses on the original representation of the other sentence for semantic alignment, which alleviates the error propagation problem by attending to a fixed semantic representation. We then design deep fusion to aggregate and propagate attention information from low layers to high layers, which effectively retains low-layer semantic features for subsequent interactions. Moreover, we introduce a self-attention mechanism at the end to enhance global matching information with lower model complexity. We conduct experiments on natural language inference and paraphrase identification tasks with three benchmark datasets: SNLI, SciTail, and Quora. Experimental results demonstrate that the proposed method achieves significant improvements over baseline systems without using any external knowledge. Additionally, we conduct an interpretability study to show how our deep interaction network with attention benefits sentence matching, which provides a reference for future model design. Ablation studies and visualization analyses further verify that our model better captures interactive information between two sentences and that the proposed components indeed help model semantic relations more precisely.
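The core idea in the abstract, that every interaction unit attends to the fixed original representation of the other sentence rather than to the output of the previous attention layer, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`attend`, `fuse`, `interact`, `match_both_directions`), the dot-product scoring, and the element-wise averaging used as a stand-in for "deep fusion" are all assumptions made for illustration, and the final self-attention step is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(queries, fixed_keys):
    """For each query vector, return a softmax-weighted sum of fixed_keys
    (dot-product scoring is an assumption, not taken from the paper)."""
    dim = len(fixed_keys[0])
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) for k in fixed_keys])
        attended.append(
            [sum(w * k[d] for w, k in zip(weights, fixed_keys)) for d in range(dim)]
        )
    return attended

def fuse(state, aligned):
    """Hypothetical fusion: average the current state with the aligned vectors,
    so information from lower layers is carried upward."""
    return [[(s + a) / 2.0 for s, a in zip(sv, av)] for sv, av in zip(state, aligned)]

def interact(source, target_original, num_units=3):
    """Stack interaction units; each one attends to the ORIGINAL target
    representation, never to an intermediate one."""
    state = source
    for _ in range(num_units):
        aligned = attend(state, target_original)
        state = fuse(state, aligned)
    return state

def match_both_directions(a, b, num_units=3):
    """Run the interaction in both directions (a over b, and b over a)."""
    return interact(a, b, num_units), interact(b, a, num_units)

# Toy token embeddings (made-up values) for two short sentences.
a = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
a_to_b, b_to_a = match_both_directions(a, b)
print(len(a_to_b), len(b_to_a))
```

Because `target_original` never changes inside the loop, an alignment mistake made by one unit does not alter what later units attend to, which is the property the abstract credits with reducing error propagation.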
Pages: 4305 - 4329
Page count: 25
Related papers
50 in total
  • [1] Deep bi-directional interaction network for sentence matching
    Mingtong Liu
    Yujie Zhang
    Jinan Xu
    Yufeng Chen
    Applied Intelligence, 2021, 51 : 4305 - 4329
  • [2] Bi-directional attention comparison for semantic sentence matching
    Huiyuan Lai
    Yizheng Tao
    Chunliu Wang
    Lunfan Xu
    Dingyong Tang
    Gongliang Li
    Multimedia Tools and Applications, 2020, 79 : 14609 - 14624
  • [3] Bi-directional attention comparison for semantic sentence matching
    Lai, Huiyuan
    Tao, Yizheng
    Wang, Chunliu
    Xu, Lunfan
    Tang, Dingyong
    Li, Gongliang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (21-22) : 14609 - 14624
  • [4] Bi-directional Maximal Matching Algorithm to Segment Khmer Words in Sentence
    Mao, Makara
    Peng, Sony
    Yang, Yixuan
    Park, Doo-Soon
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2022, 18 (04) : 549 - 561
  • [5] Bi-directional Interaction Network for Person Search
    Dong, Wenkai
    Zhang, Zhaoxiang
    Song, Chunfeng
    Tan, Tieniu
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2836 - 2845
  • [6] BI-DIRECTIONAL LABELED POINT MATCHING
    Bhagalia, Roshni
    Miller, James V.
    Roy, Arunabha
    2010 7TH IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING: FROM NANO TO MACRO, 2010, : 380 - 383
  • [7] A Deep Bi-directional Attention Network for Human Motion Recovery
    Cui, Qiongjie
    Sun, Huaijiang
    Li, Yupeng
    Kong, Yue
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 701 - 707
  • [8] Deep Bi-Directional LSTM Network for Query Intent Detection
    Sreelakshmi, K.
    Rafeeque, P. C.
    Sreetha, S.
    Gayathri, E. S.
    8TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING & COMMUNICATIONS (ICACC-2018), 2018, 143 : 939 - 946
  • [9] Speech-to-gesture generation using bi-directional LSTM network
    Kaneko N.
    Takeuchi K.
    Hasegawa D.
    Shirakawa S.
    Sakuta H.
    Sumi K.
    Transactions of the Japanese Society for Artificial Intelligence, 2019, 34 (06):
  • [10] Towards bi-directional dancing interaction
    Reidsma, Dennis
    van Welbergen, Herwin
    Poppe, Ronald
    Bos, Pieter
    Nijholt, Anton
    ENTERTAINMENT COMPUTING - ICEC 2006, 2006, 4161 : 1 - +