Deep bi-directional interaction network for sentence matching

Cited by: 10
Authors
Liu, Mingtong [1]
Zhang, Yujie [1]
Xu, Jinan [1]
Chen, Yufeng [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Traff Data Anal & Min, Sch Comp & Informat Technol, Beijing, Peoples R China
Keywords
Sentence matching; Deep interaction network; Deep fusion; Attention mechanism; Multi-layer neural network; Interpretability study
DOI
10.1007/s10489-020-02156-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The goal of sentence matching is to determine the semantic relation between two sentences, which is the basis of many downstream tasks in natural language processing, such as question answering and information retrieval. Recent studies that use attention mechanisms to align the elements of two sentences have shown promising results in capturing semantic similarity and relevance. Most existing methods focus on the design of multi-layer attention networks; however, some critical issues have not been handled well: 1) a higher attention layer is easily affected by error propagation because it relies on the alignment results of the preceding attention layers; 2) models risk losing low-layer semantic features as network depth increases; and 3) the usual approach to capturing global matching information adds substantial computational complexity to model training. To address these issues, we propose a Deep Bi-Directional Interaction Network (DBDIN), which captures semantic relatedness from two directions, with each direction employing multiple attention-based interaction units. Specifically, the attention of each interaction unit repeatedly focuses on the original representation of the other sentence for semantic alignment, which alleviates the error propagation problem by attending to a fixed semantic representation. We then design deep fusion to aggregate and propagate attention information from low layers to high layers, which effectively retains low-layer semantic features for subsequent interactions. Moreover, we introduce a self-attention mechanism as the final step to enhance global matching information with lower model complexity. We conduct experiments on natural language inference and paraphrase identification tasks with three benchmark datasets: SNLI, SciTail and Quora. Experimental results demonstrate that our proposed method achieves significant improvements over baseline systems without using any external knowledge.
Additionally, we conduct an interpretability study to show how our deep interaction network with attention benefits sentence matching, which provides a reference for future model design. Ablation studies and visualization analyses further verify that our model better captures interactive information between two sentences, and that the proposed components do help to model semantic relations more precisely.
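The three ideas named in the abstract (interaction units that always attend to the original representation of the other sentence, fusion that carries low-layer features upward, and a final self-attention step) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no equations, so the dot-product scoring, the averaging used as "fusion", the max pooling, and every function name below are assumptions chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def interaction_unit(state, other_original):
    # Align the current state against the FIXED original encoding of the
    # other sentence (assumed dot-product attention).
    weights = softmax(state @ other_original.T, axis=-1)  # (len_a, len_b)
    return weights @ other_original                        # (len_a, dim)

def one_direction(a, b, num_units=3):
    # Stack several interaction units; each one attends to the original b,
    # never to the output of an earlier attention layer.
    fused = [a]  # keep the low-layer features of a
    state = a
    for _ in range(num_units):
        fused.append(interaction_unit(state, b))
        # Hypothetical fusion: average all layers seen so far, so that
        # low-layer features are propagated to higher layers.
        state = np.mean(fused, axis=0)
    return state

def self_attention(x):
    # Single scaled dot-product self-attention pass applied at the end.
    weights = softmax(x @ x.T / np.sqrt(x.shape[-1]), axis=-1)
    return weights @ x

def match_features(a, b, num_units=3):
    # Run both directions and pool each into a fixed-size vector
    # (max pooling is an assumption, not taken from the abstract).
    a_to_b = self_attention(one_direction(a, b, num_units))
    b_to_a = self_attention(one_direction(b, a, num_units))
    return np.concatenate([a_to_b.max(axis=0), b_to_a.max(axis=0)])

# Toy inputs: two "sentences" of 5 and 7 tokens with 8-dimensional encodings.
rng = np.random.default_rng(0)
sent_a = rng.normal(size=(5, 8))
sent_b = rng.normal(size=(7, 8))
features = match_features(sent_a, sent_b)
```

In a trained model the pooled feature vector would feed a classifier; here it only shows the data flow, in which every unit in a direction reads the same unmodified encoding of the other sentence.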
Pages: 4305-4329
Number of pages: 25