Deep bi-directional interaction network for sentence matching

Cited by: 10
Authors
Liu, Mingtong [1]
Zhang, Yujie [1]
Xu, Jinan [1]
Chen, Yufeng [1]
Affiliations
[1] Beijing Jiaotong University, Beijing Key Laboratory of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing, People's Republic of China
Keywords
Sentence matching; Deep interaction network; Deep fusion; Attention mechanism; Multi-layer neural network; Interpretability study
DOI
10.1007/s10489-020-02156-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of sentence matching is to determine the semantic relation between two sentences, which is the basis of many downstream tasks in natural language processing, such as question answering and information retrieval. Recent studies that use attention mechanisms to align the elements of two sentences have shown promising results in capturing semantic similarity and relevance. Most existing methods focus on the design of multi-layer attention networks. However, some critical issues have not been addressed well: 1) a higher attention layer is easily affected by error propagation because it relies on the alignment results of the preceding attention layers; 2) models risk losing low-layer semantic features as network depth increases; and 3) capturing global matching information incurs high computational complexity during model training. To address these issues, we propose a Deep Bi-Directional Interaction Network (DBDIN), which captures semantic relatedness from two directions, each of which employs multiple attention-based interaction units. Specifically, the attention of each interaction unit repeatedly focuses on the original representation of the other sentence for semantic alignment, which alleviates the error propagation problem by attending to a fixed semantic representation. We then design deep fusion to aggregate and propagate attention information from low layers to high layers, which effectively retains low-layer semantic features for subsequent interactions. Moreover, we introduce a self-attention mechanism as the final step to enhance global matching information with lower model complexity. We conduct experiments on natural language inference and paraphrase identification tasks with three benchmark datasets: SNLI, SciTail and Quora. Experimental results demonstrate that our proposed method achieves significant improvements over baseline systems without using any external knowledge.
Additionally, we conduct an interpretability study to show how our deep interaction network with attention benefits sentence matching, which provides a reference for future model design. Ablation studies and visualization analyses further verify that our model better captures interactive information between two sentences, and that the proposed components do help model semantic relations more precisely.
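As a rough illustration of the central idea in the abstract (each interaction unit re-attends to the fixed, original representation of the other sentence, and both directions are processed), a minimal sketch follows. It is not the authors' implementation: the function names `attend`, `interact`, and `match_both_directions` are hypothetical, there are no learned parameters, plain dot-product attention is assumed, a simple average stands in for the paper's deep fusion, and the final self-attention step is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attend(queries, keys):
    """Soft-align every query vector to the rows of `keys` (dot-product attention)."""
    dim = len(keys[0])
    aligned = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        aligned.append([sum(w * k[d] for w, k in zip(weights, keys))
                        for d in range(dim)])
    return aligned


def interact(a, b_original, num_units=3):
    """One direction: a stack of interaction units over sentence `a`.

    Every unit attends to the same fixed `b_original`, never to the output
    of an earlier alignment of `b`, which is the property the abstract
    credits with limiting error propagation.
    """
    state = a
    for _ in range(num_units):
        aligned = attend(state, b_original)
        # Assumption: averaging is a placeholder for the paper's deep fusion.
        # It keeps part of the lower-layer state in every higher layer.
        state = [[0.5 * (s + t) for s, t in zip(s_vec, a_vec)]
                 for s_vec, a_vec in zip(state, aligned)]
    return state


def match_both_directions(a, b, num_units=3):
    """Run the interaction stack from a to b and from b to a."""
    return interact(a, b, num_units), interact(b, a, num_units)


if __name__ == "__main__":
    sentence_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 toy token vectors
    sentence_b = [[0.5, 0.5], [1.0, 0.0]]              # 2 toy token vectors
    a_to_b, b_to_a = match_both_directions(sentence_a, sentence_b, num_units=2)
    print(len(a_to_b), len(b_to_a))
```

Each direction keeps the token count of its own sentence, so the two outputs can later be pooled and compared by whatever classifier sits on top.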
Pages: 4305-4329
Number of pages: 25