Factors affecting the effectiveness of biomedical document indexing and retrieval based on terminologies

Cited by: 9
Authors
Dinh, Duy [1]
Tamine, Lynda [1]
Boubekeur, Fatiha [2]
Affiliations
[1] Univ Toulouse 3, Inst Rech Informat Toulouse, F-31062 Toulouse, France
[2] Mouloud Mammeri Univ, Dept Comp Sci, Tizi Ouzou 15000, Algeria
Keywords
Multi-terminology indexing; Voting techniques; Document/query expansion; Concept extraction; Biomedical retrieval; QUERY EXPANSION; INFORMATION-RETRIEVAL; TEXT; DICTIONARY; GENE;
DOI
10.1016/j.artmed.2012.08.006
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: The aim of this work is to evaluate, on the available TREC Genomics collections, a set of indexing and retrieval strategies that integrate several biomedical terminologies for an ad hoc information retrieval (IR) task. Materials and methods: We propose a multi-terminology concept extraction approach that selects the best concepts from free text by means of voting techniques. We instantiate this general approach on four terminologies (MeSH, SNOMED, ICD-10 and GO). We focus in particular on the effect of integrating terminologies into a biomedical IR process, and on the utility of voting techniques for combining the concepts extracted from each document into a single list of unique concepts. Results: Experimental studies conducted on the TREC Genomics collections show that our multi-terminology IR approach based on voting techniques yields statistically significant improvements over the baseline. For example, on the 2005 TREC Genomics collection, our multi-terminology IR approach improves MAP (mean average precision) by +6.98% (p<0.05) compared to the baseline. In addition, our experimental results show that document expansion using preferred terms, combined with query expansion using terms from the top-ranked expanded documents, improves biomedical IR effectiveness. Conclusion: We have evaluated several voting models for combining concepts drawn from multiple terminologies. Through this study, we have presented several factors affecting the effectiveness of a biomedical IR system, including term weighting, query expansion, and document expansion models. An appropriate combination of these factors could help improve IR performance. (C) 2012 Elsevier B.V. All rights reserved.
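Illustrative sketch (not part of the original record): the abstract refers to voting techniques for merging the concepts extracted with each terminology, and to MAP as the evaluation measure. The Python below is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each terminology yields a score per concept, that concepts are keyed by a normalized term string, and that a CombMNZ-style vote (sum of scores times the number of terminologies that proposed the concept) is used for fusion; the abstract does not say which voting models the paper evaluates. The function names and sample scores are hypothetical. The MAP functions follow the standard definition.

```python
from collections import defaultdict


def comb_mnz(concepts_by_terminology):
    """Fuse per-terminology concept scores with a CombMNZ-style vote.

    concepts_by_terminology maps a terminology name to {concept: score}.
    Assumption: the same concept carries the same key in every terminology.
    """
    total = defaultdict(float)  # sum of scores per concept
    votes = defaultdict(int)    # number of terminologies proposing the concept
    for scores in concepts_by_terminology.values():
        for concept, score in scores.items():
            total[concept] += score
            votes[concept] += 1
    fused = {c: total[c] * votes[c] for c in total}
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


def average_precision(ranked_doc_ids, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs, qrels):
    """Mean of average precision over all queries listed in qrels."""
    aps = [average_precision(runs.get(q, []), rel) for q, rel in qrels.items()]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    extracted = {
        "MeSH": {"neoplasms": 0.8},
        "SNOMED": {"neoplasms": 0.6},
        "GO": {"apoptosis": 0.9},
    }
    print(comb_mnz(extracted))
    print(mean_average_precision({"q1": ["d1", "d2", "d3"]}, {"q1": {"d1", "d3"}}))
```

Under these assumptions a concept proposed by several terminologies is promoted over one proposed by a single terminology, which is the intuition behind vote-based fusion. A relative improvement such as the +6.98% reported above would be computed as (MAP_system - MAP_baseline) / MAP_baseline.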
Pages: 155-167
Number of pages: 13
Related papers
50 in total
  • [21] An approach for document retrieval using cluster-based inverted indexing
    Chandwani, Gunjan
    Ahlawat, Anil
    Dubey, Gaurav
    JOURNAL OF INFORMATION SCIENCE, 2023, 49 (03) : 726 - 739
  • [22] Hybrid Indexing for Versioned Document Search with Cluster-based Retrieval
    Jin, Xin
    Agun, Daniel
    Yang, Tao
    Wu, Qinghao
    Shen, Yifan
    Zhao, Susen
    CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, : 377 - 386
  • [23] Using WordNet for Concept-Based Document Indexing in Information Retrieval
    Boubekeur, Fatiha
    Boughanem, Mohand
    Tamine, Lynda
    Daoud, Mariam
    SEMAPRO 2010: THE FOURTH INTERNATIONAL CONFERENCE ON ADVANCES IN SEMANTIC PROCESSING, 2010, : 151 - 157
  • [24] Indexing and retrieval of document images by spatial reasoning
    Punitha, P.
    Naveen
    Guru, D. S.
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2006, 4317 : 457 - +
  • [25] Arabic Document Indexing for Improved Text Retrieval
    Al-Lahham, Yaser A. M.
    2019 2ND INTERNATIONAL CONFERENCE ON NEW TRENDS IN COMPUTING SCIENCES (ICTCS), 2019, : 226 - 230
  • [26] PROJECT OF AN IRS WITH AUTOMATIC DOCUMENT INDEXING AND RETRIEVAL
    OTRADINSKII, VV
    NAUCHNO-TEKHNICHESKAYA INFORMATSIYA SERIYA 2-INFORMATSIONNYE PROTSESSY I SISTEMY, 1973, (10): : 28 - 29
  • [27] Anchor point indexing in Web document retrieval
    Kao, B
    Lee, J
    Ng, CY
    Cheung, D
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2000, 30 (03): : 364 - 373
  • [28] EXPERIMENTS WITH DOCUMENT COMPONENTS FOR INDEXING AND RETRIEVAL.
    Kwok, K.L.
    Kuan, William
    Information Processing and Management, 1988, 24 (04): : 405 - 417
  • [29] Relation-based document retrieval for biomedical literature databases
    Zhou, Xiaohua
    Hu, Xiaohua
    Lin, Xia
    Han, Hyoil
    Zhang, Xiaodan
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2006, 3882 : 689 - 701
  • [30] Indexing natural images for retrieval based on kansei factors
    Black, JA
    Kahol, K
    Tripathi, P
    Kuchi, P
    Panchanathan, S
    HUMAN VISION AND ELECTRONIC IMAGING IX, 2004, 5292 : 363 - 375