Using proximity and tag weights for focused retrieval in structured documents

Cited by: 2
Authors
Beigbeder, Michel [1]
Gery, Mathias [2]
Largeron, Christine [2]
Affiliations
[1] Ecole Natl Super Mines, F-42023 St Etienne, France
[2] Univ Lyon, St Etienne, France
Keywords
Focused information retrieval; Structured information retrieval; Proximity; XML; Tags; Term proximity
DOI
10.1007/s10115-014-0767-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Focused information retrieval is concerned with the retrieval of small units of information. In this context, the structure of the documents as well as the proximity among query terms have been found useful for improving retrieval effectiveness. In this article, we propose an approach combining the proximity of the terms and the tags which mark these terms. Our approach is based on a Fetch and Browse method where the fetch step is performed with BM25 and the browse step with a structure-enhanced proximity model. In this way, the ranking of a document depends not only upon the existence of the query terms within the document but also upon the tags which mark these terms. Thus, the document tends to be highly relevant when query terms are close together and are emphasized by tags. The evaluation of this model on a large XML structured collection provided by the INEX 2010 XML IR evaluation campaign shows that the use of term proximity and structure improves the retrieval effectiveness of BM25 in the context of focused information retrieval.
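The two-step idea in the abstract (fetch with BM25, then browse with a proximity model that takes tags into account) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below the BM25 step is an assumption made for illustration: the triangular influence function, the window size `k`, the `TAG_WEIGHTS` values, and the function names are hypothetical and are not taken from the article. The sketch also ranks whole documents, whereas the article targets focused (sub-document) retrieval.

```python
import math
from collections import Counter

# Hypothetical tag weights (illustrative values, not from the article).
TAG_WEIGHTS = {"title": 2.0, "b": 1.5, None: 1.0}


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of each document; a document is a list of (token, tag) pairs."""
    n = len(docs)
    lengths = [len(d) for d in docs]
    avgdl = sum(lengths) / n
    df = Counter()
    for d in docs:
        df.update({tok for tok, _ in d})  # document frequency of each token
    scores = []
    for d, dl in zip(docs, lengths):
        tf = Counter(tok for tok, _ in d)
        s = 0.0
        for q in query:
            if tf[q] == 0:
                continue
            idf = math.log(1 + (n - df[q] + 0.5) / (df[q] + 0.5))
            s += idf * tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores


def proximity_score(query, doc, k=5):
    """Assumed browse-step score: at each position, take the weakest query term's
    best tag-weighted triangular influence, then sum over all positions."""
    if not query:
        return 0.0
    total = 0.0
    for x in range(len(doc)):
        per_term = []
        for q in query:
            best = 0.0
            for i, (tok, tag) in enumerate(doc):
                if tok == q:
                    influence = TAG_WEIGHTS.get(tag, 1.0) * max(0.0, (k - abs(x - i)) / k)
                    best = max(best, influence)
            per_term.append(best)
        total += min(per_term)  # every query term must be near position x
    return total


def fetch_and_browse(query, docs, fetch_k=2):
    """Fetch the top fetch_k documents by BM25, then re-rank them by proximity."""
    bm25 = bm25_scores(query, docs)
    fetched = sorted(range(len(docs)), key=lambda i: bm25[i], reverse=True)[:fetch_k]
    return sorted(fetched, key=lambda i: proximity_score(query, docs[i]), reverse=True)


# Toy collection: document 0 has the query terms adjacent and inside a title tag,
# document 1 has them far apart and untagged, document 2 has neither term.
QUERY = ["xml", "retrieval"]
DOCS = [
    [("xml", "title"), ("retrieval", "title"), ("works", None)],
    [("xml", None), ("a", None), ("b", None), ("c", None),
     ("d", None), ("e", None), ("f", None), ("retrieval", None)],
    [("cats", None), ("dogs", None)],
]
ranking = fetch_and_browse(QUERY, DOCS)
```

In this toy collection the document whose query terms are adjacent and tagged as a title ends up ahead of the one whose terms are far apart and untagged, which mirrors the behaviour the abstract describes.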
Pages: 51-76
Number of pages: 26
Related papers
50 in total
  • [1] Using proximity and tag weights for focused retrieval in structured documents
    Michel Beigbeder
    Mathias Géry
    Christine Largeron
    [J]. Knowledge and Information Systems, 2015, 44 : 51 - 76
  • [2] Structured storage and retrieval of SGML documents using Grove
    Kim, HG
    Cho, SB
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2000, 36 (04) : 643 - 657
  • [3] Abductive retrieval of structured documents
    Muller, AA
    [J]. MATHEMATICAL AND COMPUTER MODELLING, 1997, 26 (01) : 15 - 28
  • [4] STORAGE AND RETRIEVAL OF STRUCTURED DOCUMENTS
    MACLEOD, IA
    [J]. INFORMATION PROCESSING & MANAGEMENT, 1990, 26 (02) : 197 - 208
  • [5] Passage Retrieval on Structured Documents Using Graph Attention Networks
    Albarede, Lucas
    Mulhem, Philippe
    Goeuriot, Lorraine
    Le Pape-Gardeux, Claude
    Marie, Sylvain
    Chardin-Segui, Trinidad
    [J]. ADVANCES IN INFORMATION RETRIEVAL, PT II, 2022, 13186 : 13 - 21
  • [6] Annotation and retrieval of structured video documents
    Bertini, M
    Del Bimbo, A
    Nunziati, W
    [J]. ADVANCES IN INFORMATION RETRIEVAL, 2003, 2633 : 12 - 24
  • [7] Typed structured documents for information retrieval
    Dharap, C
    Bowman, CM
    [J]. PRINCIPLES OF DOCUMENT PROCESSING, 1997, 1293 : 135 - 151
  • [8] Semantic Proximity in Information Retrieval and Documents Classification
    Vishnyakov, Yury
    Vishnyakov, Renat
    [J]. 14TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND INFORMATICS (CINTI), 2013, : 131 - 134
  • [9] CGKAT: A knowledge acquisition and retrieval tool using structured documents and ontologies
    Martin, P
    [J]. CONCEPTUAL STRUCTURES: FULFILLING PEIRCE'S DREAM, 1997, 1257 : 581 - 584
  • [10] A New Metric for Multimedia Retrieval in Structured Documents
    Fakhfakh, Sana
    Tmar, Mohamed
    Mahdi, Walid
    [J]. ICEIS: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS - VOL 2, 2013, : 240 - 247