Learning Question Paraphrases for QA from Encarta Logs

Cited by: 0
Authors
Zhao, Shiqi [1]
Zhou, Ming
Liu, Ting [1]
Affiliations
[1] Harbin Inst Technol, Informat Retrieval Lab, Harbin 150006, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Question paraphrasing is critical in many Natural Language Processing (NLP) applications, especially for question reformulation in question answering (QA). However, choosing an appropriate data source and developing effective methods are challenging tasks. In this paper, we propose a method that exploits Encarta logs to automatically identify question paraphrases and extract templates. Questions from Encarta logs are partitioned into small clusters, within which a perceptron classifier is used to identify question paraphrases. Experiments show that: (1) Encarta log data is a suitable data source for question paraphrasing, and the user clicks in the data are indicative clues for recognizing paraphrases; (2) the supervised method we present is effective and clearly outperforms the unsupervised method, and the features introduced to identify paraphrases are sound; (3) the obtained question paraphrase templates are quite effective in question reformulation, raising the MRR from 0.2761 to 0.4939 on the questions of TREC QA 2003.
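The MRR (mean reciprocal rank) figures quoted in the abstract can be read with the standard definition in mind: for each question, take 1/rank of the first correct answer (0 if none is returned), then average over all questions. The following is a minimal sketch of that computation, not the authors' evaluation code; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked_answers: Sequence[str], correct: Set[str]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer in correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over (ranked answers, correct answers) pairs."""
    scores = [reciprocal_rank(ranked, correct) for ranked, correct in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: three questions with ranked candidate answers.
toy_runs = [
    (["Paris", "Lyon", "Nice"], {"Paris"}),    # correct at rank 1 -> 1.0
    (["1901", "1899", "1905"], {"1905"}),      # correct at rank 3 -> 1/3
    (["red", "blue", "green"], {"yellow"}),    # no correct answer -> 0.0
]

print(round(mean_reciprocal_rank(toy_runs), 4))
```

Under this definition, a higher MRR means that correct answers tend to appear nearer the top of the ranked list, which is how the reported improvement from question reformulation should be interpreted.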
Pages: 1795-1800
Page count: 6