Duplicate Question Detection based on Neural Networks and Multi-head Attention

Authors
Zhang, Heng [1]
Chen, Liangyu [1]
Affiliations
[1] East China Normal Univ, Shanghai Key Lab Trustworthy Comp, Shanghai, Peoples R China
Keywords
deep learning; multi-head attention; ensemble learning
DOI
10.1109/ialp48816.2019.9037671
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
It is well known that a single neural network cannot achieve satisfactory accuracy on the Duplicate Question Detection problem. To overcome this limitation, different neural networks are often ensembled serially in pursuit of better accuracy. However, blindly increasing the depth of a neural network introduces problems such as vanishing or exploding gradients. Worse, serial integration can be computationally inefficient because it is less parallelizable and takes longer to train. To solve these problems, we use ensemble learning: we treat different neural networks as individual learners, run them in parallel, and propose a new voting mechanism to improve detection accuracy. In addition to classical models based on recurrent or convolutional neural networks, multi-head attention is also integrated to reduce the correlation and the performance gap between different models. Experimental results on the Quora Question Pairs dataset show that the accuracy of our method reaches 89.3%.
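The abstract does not specify the proposed voting mechanism, so the following is only a minimal sketch of the general idea of running individual learners in parallel and combining their outputs. It substitutes plain weighted soft voting for the paper's mechanism, and it uses toy string-similarity functions in place of the recurrent, convolutional, and multi-head attention models. All names here (`soft_vote`, `jaccard_learner`, `length_learner`, `exact_learner`) are hypothetical and are not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Sequence, Tuple

Pair = Tuple[str, str]
# A learner maps a question pair to an estimated P(duplicate) in [0, 1].
Learner = Callable[[Pair], float]


def jaccard_learner(pair: Pair) -> float:
    """Toy stand-in for a trained model: token-set overlap."""
    a, b = set(pair[0].lower().split()), set(pair[1].lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def length_learner(pair: Pair) -> float:
    """Toy stand-in for a trained model: ratio of token counts."""
    n, m = len(pair[0].split()), len(pair[1].split())
    return min(n, m) / max(n, m) if max(n, m) else 0.0


def exact_learner(pair: Pair) -> float:
    """Toy stand-in for a trained model: case-insensitive exact match."""
    return 1.0 if pair[0].strip().lower() == pair[1].strip().lower() else 0.0


def soft_vote(
    pair: Pair,
    learners: Sequence[Learner],
    weights: Optional[Sequence[float]] = None,
    threshold: float = 0.5,
) -> Tuple[bool, float]:
    """Run all learners in parallel and combine them by weighted averaging.

    Assumption: weighted soft voting, not the voting mechanism of the paper.
    """
    if weights is None:
        weights = [1.0] * len(learners)
    with ThreadPoolExecutor() as pool:
        # Each learner is independent, so they can be evaluated concurrently.
        probs = list(pool.map(lambda learner: learner(pair), learners))
    score = sum(w * p for w, p in zip(weights, probs)) / sum(weights)
    return score >= threshold, score


if __name__ == "__main__":
    learners = [jaccard_learner, length_learner, exact_learner]
    pair = ("How do I learn Python?", "How do I learn Python?")
    print(soft_vote(pair, learners))
```

Because the learners never depend on each other's output, adding another model only adds one more entry to the list, which is the property that makes a parallel ensemble easier to scale than a serially stacked one.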
Pages: 13-18
Page count: 6