Text similarity: an alternative way to search MEDLINE

Cited by: 60
Authors
Lewis, James [1]
Ossowski, Stephan [1]
Hicks, Justin [1]
Errami, Mounir [1]
Garner, Harold R. [1]
Affiliations
[1] Univ Texas, SW Med Ctr, Eugene McDermott Ctr Human Growth & Dev, Div Translat Res, Dallas, TX 75390 USA
DOI
10.1093/bioinformatics/btl388
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The most widely used literature search techniques, such as those offered by NCBI's PubMed system, require significant effort from the searcher, and inexperienced searchers do not use these systems as effectively as experienced users. Improved literature search engines can save researchers time and effort by making it easier to locate the most important and relevant literature.

Results: We have created and optimized a new hybrid search system for Medline that takes natural text as input and delivers results with high precision and recall. It combines a fast, low-sensitivity, weighted keyword-based first-pass algorithm, which casts a wide net to gather an initial set of literature, with a unique sentence-alignment-based similarity algorithm that rank-orders those results. The resulting system is sensitive, fast and easy to use. Several text similarity search algorithms, both standard and novel, were implemented and tested to determine which obtained the best results in information retrieval exercises.
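The two-stage design described in the abstract (a cheap weighted keyword pass to collect candidates, then a sentence-level alignment to rank them) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the IDF-style keyword weights, the longest-common-subsequence alignment, and every function name below are assumptions chosen only to make the idea concrete.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter (assumption: sentences end in . ! or ?)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def idf_weights(docs):
    """Assumed keyword weighting: smoothed inverse document frequency."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in df.items()}


def first_pass(query, docs, weights, keep):
    """Stage 1: score each document by the summed weight of shared keywords."""
    q = set(tokenize(query))
    scored = []
    for i, doc in enumerate(docs):
        shared = q & set(tokenize(doc))
        scored.append((sum(weights.get(w, 0.0) for w in shared), i))
    scored.sort(key=lambda p: (-p[0], p[1]))
    return [i for score, i in scored[:keep] if score > 0]


def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def alignment_score(query, doc):
    """Stage 2 (assumed): align each query sentence to its best document sentence."""
    q_sents = [tokenize(s) for s in split_sentences(query)]
    d_sents = [tokenize(s) for s in split_sentences(doc)]
    total = 0.0
    for qs in q_sents:
        if not qs:
            continue
        best = max((lcs_len(qs, ds) for ds in d_sents), default=0)
        total += best / len(qs)
    return total / max(len(q_sents), 1)


def search(query, docs, keep=10):
    """Gather candidates by keyword weight, then rank them by alignment score."""
    weights = idf_weights(docs)
    candidates = first_pass(query, docs, weights, keep)
    return sorted(candidates, key=lambda i: (-alignment_score(query, docs[i]), i))
```

Calling `search(query, docs)` returns document indices, best match first. The `keep` parameter (hypothetical) bounds how many first-pass candidates reach the slower alignment stage, which is the trade-off the abstract points to: the keyword pass is fast but coarse, while the alignment is more sensitive but too costly to run over an entire collection.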
Pages: 2298-2304
Page count: 7
Related papers
50 in total
  • [41] Similarity Search Methods As an Alternative to Sub-Type Characterisation in Aggressive Lymphomas
    Sha, Chulin
    Barrans, Sharon
    Jack, Andrew
    Burton, Cathy H.
    Smith, Alexandra
    Roman, Eve
    Painter, Dan
    Crouch, Simon
    Tooze, Reuben M.
    Care, Matthew
    Westhead, David R.
    BLOOD, 2016, 128 (22)
  • [42] AN INVESTIGATION OF THE OPTIMIZATION OF SEARCH LOGIC FOR THE MEDLINE DATABASE
    HEINE, MH
    TAGUE, JM
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1991, 42 (04) : 267 - 278
  • [43] Using Discourse Analysis to Improve Text Categorization in MEDLINE
    Ruch, Patrick
    Geissbuehler, Antoine
    Gobeill, Julien
    Lisacek, Frederic
    Tbahriti, Imad
    Veuthey, Anne-Lise
    Aronson, Alan R.
    MEDINFO 2007: PROCEEDINGS OF THE 12TH WORLD CONGRESS ON HEALTH (MEDICAL) INFORMATICS, PTS 1 AND 2: BUILDING SUSTAINABLE HEALTH SYSTEMS, 2007, 129 : 710 - +
  • [44] Bidirectional String Anchors for Improved Text Indexing and Top-K Similarity Search
    Loukides, Grigorios
    Pissis, Solon P.
    Sweering, Michelle
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (11) : 11093 - 11111
  • [45] THE MEDLINE FULL-TEXT RESEARCH-PROJECT
    MCKININ, EJ
    SIEVERT, ME
    JOHNSON, ED
    MITCHELL, JA
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1991, 42 (04) : 297 - 307
  • [46] An improved measuring similarity for short text snippets and its application in clustering search engine
    Li, Zhao
    Peng, Hong
    Peng, Peng
    Jia, Xi-Ping
    Wang, Jia-Bing
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008 : 1581 - 1585
  • [47] COMPARISON OF SEARCH STRATEGIES ON CD PLUS MEDLINE
    WRIGHT, LC
    SUTHERLAND, HJ
    JACKSON, JI
    TILL, JE
    CANADIAN MEDICAL ASSOCIATION JOURNAL, 1991, 145 (05) : 457 - 464
  • [48] Search results outliers among MEDLINE platforms
    Burns, Christopher Sean
    Shapiro, Robert M., II
    Nix, Tyler
    Huber, Jeffrey T.
    JOURNAL OF THE MEDICAL LIBRARY ASSOCIATION, 2019, 107 (03) : 364 - 373
  • [49] Age-specific search strategies for Medline
    Kastner, Monika
    Wilczynski, Nancy L.
    Walker-Dilks, Cindy
    McKibbon, Kathleen Ann
    Haynes, Brian
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2006, 8 (04) : e25
  • [50] COMPARISON OF SEARCH STRATEGIES ON CD PLUS MEDLINE
    GANNON, RP
    CANADIAN MEDICAL ASSOCIATION JOURNAL, 1992, 146 (02) : 100 - 101