Factors affecting the effectiveness of biomedical document indexing and retrieval based on terminologies

Cited by: 9
Authors
Dinh, Duy [1]
Tamine, Lynda [1]
Boubekeur, Fatiha [2]
Affiliations
[1] Univ Toulouse 3, Inst Rech Informat Toulouse, F-31062 Toulouse, France
[2] Mouloud Mammeri Univ, Dept Comp Sci, Tizi Ouzou 15000, Algeria
Keywords
Multi-terminology indexing; Voting techniques; Document/query expansion; Concept extraction; Biomedical retrieval; QUERY EXPANSION; INFORMATION-RETRIEVAL; TEXT; DICTIONARY; GENE;
DOI
10.1016/j.artmed.2012.08.006
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: The aim of this work is to evaluate a set of indexing and retrieval strategies that integrate several biomedical terminologies, using the available TREC Genomics collections for an ad hoc information retrieval (IR) task. Materials and methods: We propose a multi-terminology concept extraction approach that selects the best concepts from free text by means of voting techniques. We instantiate this general approach on four terminologies (MeSH, SNOMED, ICD-10 and GO). We focus in particular on the effect of integrating terminologies into a biomedical IR process, and on the utility of voting techniques for combining the concepts extracted from each document into a list of unique concepts. Results: Experimental studies conducted on the TREC Genomics collections show that our multi-terminology IR approach based on voting techniques yields statistically significant improvements over the baseline. For example, on the 2005 TREC Genomics collection, our multi-terminology IR approach improves MAP (mean average precision) by +6.98% (p<0.05) compared to the baseline. In addition, our experimental results show that document expansion using preferred terms, combined with query expansion using terms from the top-ranked expanded documents, improves biomedical IR effectiveness. Conclusion: We have evaluated several voting models for combining concepts extracted from multiple terminologies. Through this study, we have presented several factors affecting the effectiveness of a biomedical IR system, including term weighting, query expansion, and document expansion models. An appropriate combination of those factors could improve IR performance. (C) 2012 Elsevier B.V. All rights reserved.
Pages: 155-167
Page count: 13
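
The abstract describes combining the concepts extracted per terminology into a single list of unique concepts by voting, but it does not say which voting models were evaluated. The following is a minimal sketch under stated assumptions, not the authors' method: it uses three common data-fusion rules (CombSUM, CombMNZ, and reciprocal-rank voting), and the function name `vote`, the concept labels, and all scores are hypothetical illustrations.

```python
from collections import defaultdict


def vote(ranked_lists, method="combmnz"):
    """Fuse per-terminology concept rankings into one list of unique concepts.

    ranked_lists: dict mapping a terminology name to a list of
                  (concept, score) pairs, best concept first.
    method:       "combsum" (sum of scores), "combmnz" (sum of scores times
                  the number of terminologies voting for the concept), or
                  "rr" (sum of reciprocal ranks).
    All three rules are generic fusion techniques chosen for illustration.
    """
    if method not in {"combsum", "combmnz", "rr"}:
        raise ValueError(f"unknown voting method: {method}")

    scores = defaultdict(float)  # fused score per unique concept
    votes = defaultdict(int)     # number of terminologies proposing it

    for concepts in ranked_lists.values():
        for rank, (concept, score) in enumerate(concepts, start=1):
            votes[concept] += 1
            scores[concept] += 1.0 / rank if method == "rr" else score

    if method == "combmnz":
        for concept in scores:
            scores[concept] *= votes[concept]

    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical concepts extracted from one document by three terminologies.
extracted = {
    "MeSH": [("neoplasms", 0.9), ("apoptosis", 0.6)],
    "SNOMED": [("neoplasms", 0.8), ("gene expression", 0.5)],
    "GO": [("apoptosis", 0.7), ("gene expression", 0.4)],
}

for concept, score in vote(extracted, method="combmnz"):
    print(f"{concept}\t{score:.2f}")
```

In this sketch a concept proposed by several terminologies accumulates evidence from each of them, so agreement between terminologies pushes it up the fused list. Matching concepts across terminologies by a shared label is itself a simplifying assumption; a real system would need a mapping between terminology identifiers.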