Harnessing the Power of Metadata for Enhanced Question Retrieval in Community Question Answering

Authors
Ghasemi, Shima [1]
Shakery, Azadeh [1, 2]
Affiliations
[1] Univ Tehran, Coll Engn, Sch Elect & Comp Engn, Tehran 1439957131, Iran
[2] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran 193955746, Iran
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Community question answering; metadata; question retrieval
DOI
10.1109/ACCESS.2024.3395449
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Community Question Answering (CQA) forums such as Yahoo! Answers and Stack Overflow have become popular. The main goal of a CQA forum is to provide the most suitable answer in the shortest possible time. Because these forums hold a rich archive of answered questions, similar question retrieval has received much attention as a way to answer new questions as soon as they are asked. One of the main challenges in this task is the lexical gap between questions, that is, the mismatch between the terms different users choose when asking about the same thing. Additional context about the questions can help to bridge this gap, and metadata, the supplementary data associated with each question, is a rich source of such context. In this paper, we use metadata and two transformer-based techniques to improve the translation-based language model, a traditional technique for addressing the lexical gap in retrieval systems. The metadata used in this article are subject, category, and answer. First, to utilize category information, we build category-specific dictionaries to obtain more accurate translation probabilities; a BERT model predicts the categories of the questions. Second, to utilize answer information, we propose a question expansion technique in which a retrieval-augmented generation (RAG) model generates answers and each new question is expanded with its generated answer. Finally, candidate questions are ranked according to their similarity to the expanded new question. Our proposed method achieves a MAP of 51.47, outperforming all state-of-the-art approaches in question retrieval.
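As a rough illustration of the ranking idea in the abstract, the sketch below scores candidate questions with one common formulation of the translation-based language model and looks up translation probabilities in a per-category table. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy translation table, the smoothing weights lam and beta, and the probability floor are all hypothetical, and the category is passed in by hand rather than predicted by BERT. RAG-based question expansion is not shown; under this sketch it would amount to appending generated answer tokens to the query before scoring.

```python
import math
from collections import Counter


def trlm_log_score(query, candidate, coll_counts, coll_len, table,
                   lam=0.2, beta=0.5):
    """Log-likelihood of the query tokens given one candidate question.

    `table` maps (query_word, candidate_word) to a translation probability.
    lam and beta are illustrative smoothing weights, not values from the paper.
    """
    cand_counts = Counter(candidate)
    cand_len = len(candidate)
    if cand_len == 0 or coll_len == 0:
        return float("-inf")
    score = 0.0
    for w in query:
        # Exact-match evidence: maximum-likelihood estimate in the candidate.
        p_ml = cand_counts[w] / cand_len
        # Translation evidence: candidate words that "translate" to w.
        p_tr = sum(table.get((w, t), 0.0) * c / cand_len
                   for t, c in cand_counts.items())
        p_mix = (1 - beta) * p_ml + beta * p_tr
        # Smooth with the collection-wide language model.
        p_coll = coll_counts.get(w, 0) / coll_len
        p = (1 - lam) * p_mix + lam * p_coll
        score += math.log(max(p, 1e-12))  # floor avoids log(0); an assumption
    return score


def rank_candidates(query, category, candidates, tables, lam=0.2, beta=0.5):
    """Return candidate indices, best first, using the category's table."""
    coll_counts = Counter(tok for cand in candidates for tok in cand)
    coll_len = sum(coll_counts.values())
    table = tables.get(category, {})  # category-specific dictionary
    scores = [trlm_log_score(query, cand, coll_counts, coll_len, table,
                             lam, beta) for cand in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i],
                  reverse=True)


# Toy data, invented for illustration only.
tables = {"computers": {("laptop", "notebook"): 0.8,
                        ("freeze", "hangs"): 0.7}}
candidates = [["my", "notebook", "hangs"], ["best", "cake", "recipe"]]
query = ["laptop", "freeze"]
order = rank_candidates(query, "computers", candidates, tables)
```

In this toy setup the query shares no word with either candidate, so only the category-specific translation table lets the first candidate score above the probability floor, which is the lexical-gap effect the abstract describes.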
Pages: 65768-65779
Page count: 12