Field-weighted XML retrieval based on BM25

Cited by: 0
Authors
Lu, Wei [1]
Robertson, Stephen
MacFarlane, Andrew
Affiliations
[1] Wuhan Univ, Sch Informat Management, Ctr Studies Informat Resources, Wuhan 430072, Peoples R China
[2] Microsoft Res, Cambridge, England
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This is the first year that the Centre for Interactive Systems Research has participated in INEX. Using a newly developed XML indexing and retrieval system built on Okapi, we extend Robertson's field-weighted BM25F function for document retrieval to an element-level retrieval function, BM25E. In this paper, we introduce the new function and our experimental method in detail, and then show how we tuned the weights for our selected fields using the INEX 2004 topics and assessments. Based on the tuned models, we submitted runs for CO.Thorough and CO.FetchBrowse. The methods we propose show real promise. Existing problems and future work are also discussed.
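The abstract does not give the BM25E formula itself. As a rough illustration of the general idea behind field-weighted BM25, the sketch below combines per-field term frequencies using field weights and then applies the usual BM25 saturation once, with an XML element treated as the scored unit. Everything here is an assumption made for illustration rather than the authors' method: the function name `field_weighted_bm25`, the field names, the default parameters `k1` and `b`, the use of a weighted element length, and the non-negative IDF variant.

```python
import math


def field_weighted_bm25(query_terms, element_fields, field_weights,
                        doc_freq, n_elements, avg_len, k1=1.2, b=0.75):
    """Score one element (hypothetical helper, not the paper's BM25E).

    element_fields: dict mapping field name -> list of tokens in that field
    field_weights:  dict mapping field name -> weight (defaults to 1.0)
    doc_freq:       dict mapping term -> number of elements containing it
    avg_len:        average weighted element length over the collection
    """
    # Weighted length: each field's token count scaled by its weight.
    weighted_len = sum(field_weights.get(name, 1.0) * len(tokens)
                       for name, tokens in element_fields.items())
    # Length normalisation term shared by all query terms.
    norm = k1 * ((1.0 - b) + b * weighted_len / avg_len)

    score = 0.0
    for term in query_terms:
        # Weighted term frequency: sum of per-field counts times weights.
        wtf = sum(field_weights.get(name, 1.0) * tokens.count(term)
                  for name, tokens in element_fields.items())
        if wtf == 0:
            continue
        df = doc_freq.get(term, 0)
        # Assumed IDF variant; the "+ 1.0" keeps the value non-negative.
        idf = math.log((n_elements - df + 0.5) / (df + 0.5) + 1.0)
        # Saturation is applied once, to the combined frequency.
        score += idf * wtf * (k1 + 1.0) / (wtf + norm)
    return score


# Toy element with two fields; all values are made up for illustration.
element = {
    "title": ["xml", "retrieval"],
    "body": ["okapi", "system", "for", "ranking"],
}
flat = field_weighted_bm25(["xml"], element, {"title": 1.0, "body": 1.0},
                           {"xml": 1}, n_elements=10, avg_len=6.0)
boosted = field_weighted_bm25(["xml"], element, {"title": 3.0, "body": 1.0},
                              {"xml": 1}, n_elements=10, avg_len=6.0)
print(flat, boosted)
```

In this toy setup, raising the title weight increases the score of an element whose only match is in its title. Tuning such weights against past topics and relevance assessments is the kind of procedure the abstract describes.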
Pages: 161-171
Page count: 11