Searching Semantically Similar Questions from a Large Community-based Question Archive

Cited by: 0
Authors
Liu, Mingrong [1]
Liu, Yicen [1]
Yang, Qing [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100080, Peoples R China
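Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code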
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel, fully statistical method for retrieving questions similar to a given query question from a large question archive. First, a word relevance model is trained on the whole archive, which consists of millions of natural-language questions posted by web users. The model is used to find the words most semantically related to a given word. Second, to find questions semantically similar to the query, each non-stop word in a question is expanded using the word relevance model and represented as a word vector whose elements are the word itself and some of its semantically related words. The elements are weighted by combining a classical IR term-weighting method with the word transformation probability learned from the relevance model. The question is then mapped to a question vector, defined as the normalized center of the word vectors of the words it contains. Question retrieval then reduces to comparing the similarity between question vectors. The method is in effect a simple kernel approach based on question expansion. Experimental results indicate that the proposed method outperforms baseline methods such as the Vector Space Model (VSM) and the Language Model for Information Retrieval (LMIR).
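The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the relevance table `RELEVANCE`, the stop-word list `STOPWORDS`, the `top_k` cutoff, and the weighting rule (IDF of the source word multiplied by the transformation probability) are all hypothetical stand-ins chosen for illustration, since the abstract does not give the exact formulas.

```python
import math

# Hypothetical toy data; a real word relevance model would be learned from the archive.
STOPWORDS = {"how", "to", "a", "an", "where", "the"}
RELEVANCE = {  # assumed P(related word | word)
    "buy": {"purchase": 0.5},
    "purchase": {"buy": 0.5},
    "car": {"automobile": 0.6},
    "automobile": {"car": 0.6},
}


def word_vector(word, relevance, idf, top_k=2):
    """Expand a word into a sparse vector: itself plus its top_k related words."""
    base = idf.get(word, 1.0)  # classical IR term weight (assumed: IDF, default 1.0)
    vec = {word: base}
    related = sorted(relevance.get(word, {}).items(), key=lambda kv: -kv[1])
    for other, prob in related[:top_k]:
        # Assumed combination rule: term weight times transformation probability.
        vec[other] = vec.get(other, 0.0) + base * prob
    return vec


def question_vector(question, relevance, idf, stopwords=STOPWORDS):
    """Sum the word vectors of the non-stop words and scale to unit length.

    After normalization, the sum and the center (mean) point the same way.
    """
    total = {}
    for word in question.lower().split():
        if word in stopwords:
            continue
        for term, weight in word_vector(word, relevance, idf).items():
            total[term] = total.get(term, 0.0) + weight
    norm = math.sqrt(sum(w * w for w in total.values()))
    if norm == 0.0:
        return {}
    return {term: w / norm for term, w in total.items()}


def similarity(q1, q2, relevance=RELEVANCE, idf=None):
    """Dot product of the two normalized question vectors (cosine similarity)."""
    idf = idf or {}
    v1 = question_vector(q1, relevance, idf)
    v2 = question_vector(q2, relevance, idf)
    return sum(w * v2.get(term, 0.0) for term, w in v1.items())


if __name__ == "__main__":
    print(similarity("how to buy a car", "where to purchase an automobile"))
    print(similarity("how to buy a car", "how to bake the bread"))
```

In this toy setting the first pair shares no surface words, so a plain bag-of-words comparison would score it zero, whereas the expanded vectors overlap through the related words. The second pair stays at zero because no relevance entry links its vocabulary.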
Pages: 485 - 492
Number of pages: 8