Deep bi-directional interaction network for sentence matching

Cited by: 10
Authors
Liu, Mingtong [1]
Zhang, Yujie [1]
Xu, Jinan [1]
Chen, Yufeng [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Traff Data Anal & Min, Sch Comp & Informat Technol, Beijing, Peoples R China
Keywords
Sentence matching; Deep interaction network; Deep fusion; Attention mechanism; Multi-layer neural network; Interpretability study
DOI
10.1007/s10489-020-02156-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of sentence matching is to determine the semantic relation between two sentences, which is the basis of many downstream tasks in natural language processing, such as question answering and information retrieval. Recent studies that use attention mechanisms to align the elements of two sentences have shown promising results in capturing semantic similarity and relevance. Most existing methods focus on the design of multi-layer attention networks; however, some critical issues have not been addressed well: 1) a higher attention layer is easily affected by error propagation because it relies on the alignment results of the preceding attention layers; 2) models risk losing low-layer semantic features as network depth increases; and 3) capturing global matching information incurs high computational complexity during model training. To address these issues, we propose a Deep Bi-Directional Interaction Network (DBDIN), which captures semantic relatedness from two directions, with each direction employing multiple attention-based interaction units. Specifically, the attention of each interaction unit repeatedly focuses on the original representation of the other sentence for semantic alignment, which alleviates the error propagation problem by attending to a fixed semantic representation. We then design deep fusion to aggregate and propagate attention information from low layers to high layers, which effectively retains low-layer semantic features for subsequent interactions. Moreover, we introduce a self-attention mechanism at the final stage to enhance global matching information with lower model complexity. We conduct experiments on natural language inference and paraphrase identification tasks with three benchmark datasets: SNLI, SciTail and Quora. Experimental results demonstrate that our proposed method achieves significant improvements over baseline systems without using any external knowledge.
Additionally, we conduct an interpretability study to show how our deep interaction network with attention benefits sentence matching, which provides a reference for future model design. Ablation studies and visualization analyses further verify that our model better captures interactive information between two sentences, and that the proposed components indeed help model semantic relations more precisely.
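The central idea in the abstract, that every interaction unit attends to the fixed, original representation of the other sentence rather than to the output of the previous attention layer, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`attend`, `interaction_direction`, `match_both`), the plain dot-product scoring, and the averaging used as a stand-in for "deep fusion" are all assumptions made for illustration, since the abstract does not specify these details. Sentences are represented as lists of token vectors.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys):
    # Dot-product attention (an assumed scoring function): each query
    # vector is replaced by a weighted average of the key vectors.
    dim = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * k[d] for w, k in zip(weights, keys))
                    for d in range(dim)])
    return out

def interaction_direction(a, b, num_units=3):
    # One direction: sentence `a` is aligned against sentence `b`.
    # Every unit attends to the ORIGINAL `b`, never to a previously
    # aligned version of it, which is the property the abstract
    # describes as limiting error propagation.
    state = a
    for _ in range(num_units):
        aligned = attend(state, b)
        # Hypothetical fusion: average the original input, the current
        # state and the new alignment so low-layer features persist.
        state = [[(x + s + al) / 3 for x, s, al in zip(xa, xs, xal)]
                 for xa, xs, xal in zip(a, state, aligned)]
    return state

def match_both(a, b, num_units=3):
    # Bi-directional use: run the same procedure in both directions.
    return (interaction_direction(a, b, num_units),
            interaction_direction(b, a, num_units))
```

Calling `match_both(a, b)` returns one refined representation per sentence, each with the same number of token vectors as its input. A trained model would add learned projections, the final self-attention step, and a classifier on top; none of those are shown here.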
Pages: 4305-4329
Page count: 25