Forum Duplicate Question Detection by Domain Adaptive Semantic Matching

Cited by: 7
Authors
Xu, Zhuojia [1]
Yuan, Hua [1]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Commun & Comp Network Lab Guangdong, Guangzhou 510641, Guangdong, Peoples R China
Keywords
Community question answering; duplicate question detection; semantic matching; transfer learning
DOI
10.1109/ACCESS.2020.2982268
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Community Question Answering (CQA) forums, such as Stack Overflow, Stack Exchange and Massive Open Online Course (MOOC) forums, spend a lot of manpower and time to manage duplicate questions on the forum. Mismatch of duplicate questions makes users keep asking “new” questions, and the continuous accumulation of duplicate questions may interfere with their information searching again, affecting user satisfaction. Neural Networks (NN) models for parsing semantics provide the possibility of end-to-end duplicate question detection. Whereas, due to lack of domain data and expertise, NN models for semantic parsing are rarely directly applied to CQA duplicate question detection. This paper proposes a Semantic Matching Model (SMM) integrated with the multi-task transfer learning framework for multi-domain forum duplicate question detection. By designing the word-to-sentence interaction mechanism based on the word-to-word interaction, SMM can automatically choose to ignore or pay attention to potential similar words according to the semantics at the sentence level. The experiments on the benchmark data set and MOOC forum data set state that SMM outperforms baselines, its interaction mechanism is effective and it has an advantage in cross-domain duplicate question detection.
Pages: 56029-56038
Page count: 10