Efficient techniques to explore and rank paths in life science data sources

Cited by: 0
Authors
Lacroix, Zoé [1]
Raschid, Louiqa [2]
Vidal, Maria-Esther [3]
Affiliations
[1] Arizona State University, Tempe, AZ 85287-6106, United States
[2] University of Maryland, College Park, MD 20742, United States
[3] Universidad Simon Bolivar, Caracas 1080, Venezuela
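Funding
Medical Research Council (UK); National Health and Medical Research Council (Australia); National Institutes of Health (US); Wellcome Trust (UK);
Keywords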
Semantics - Polynomial approximation;
DOI
10.1007/978-3-540-24745-6_13
Abstract
Life science data sources represent a complex, link-driven federation of publicly available, Web-accessible sources. A fundamental need for scientists today is the ability to completely explore all relationships between scientific classes, e.g., genes and citations, that may be retrieved from various data sources. A challenge to such exploration is that each path between data sources potentially has different domain-specific semantics and yields a different benefit to the scientist. Thus, it is important to explore paths efficiently so as to generate the paths with the highest benefits. In this paper, we explore the search space of paths that satisfy queries expressed as regular expressions. We propose an algorithm, ESearch, that runs in polynomial time in the size of the graph when the graph is acyclic. We present expressions to determine the benefit of a path based on metadata (statistics). We develop a heuristic search, OnlyBestXX%. Finally, we compare OnlyBestXX% and ESearch. © Springer-Verlag 2004.
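As a rough illustration of the problem the abstract describes, the sketch below enumerates paths in a small acyclic graph of data sources, keeps those whose sequence of class labels matches a regular expression, and ranks them by a benefit score. It is not ESearch or OnlyBestXX%: it is a brute-force enumeration, and the source names, class labels, link weights, and the product-of-weights benefit are all hypothetical assumptions made for the example.

```python
import re

# Hypothetical toy graph (not from the paper). Each source carries a
# one-letter class label: g = gene, p = protein, c = citation.
NODE_CLASS = {"GeneDB": "g", "ProtDB": "p", "SeqDB": "p", "CiteDB": "c"}

# Directed, acyclic links between sources; the weights stand in for
# per-link metadata and are invented for illustration.
LINKS = {
    "GeneDB": {"ProtDB": 0.9, "SeqDB": 0.4, "CiteDB": 0.5},
    "ProtDB": {"CiteDB": 0.8},
    "SeqDB": {"CiteDB": 0.7},
    "CiteDB": {},
}


def paths_from(node, prefix=None):
    """Yield every path that starts at `node` (terminates because the graph is acyclic)."""
    prefix = (prefix or []) + [node]
    yield prefix
    for nxt in LINKS[node]:
        yield from paths_from(nxt, prefix)


def benefit(path):
    """Assumed benefit: the product of the link weights along the path."""
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= LINKS[a][b]
    return score


def ranked_paths(start, pattern):
    """Return paths from `start` whose class-label string fully matches `pattern`, best first."""
    regex = re.compile(pattern)
    matches = [
        p for p in paths_from(start)
        if regex.fullmatch("".join(NODE_CLASS[n] for n in p))
    ]
    return sorted(matches, key=benefit, reverse=True)


if __name__ == "__main__":
    # Query: start at a gene source, pass through any number of protein
    # sources, and end at a citation source.
    for path in ranked_paths("GeneDB", "gp*c"):
        print(" -> ".join(path), round(benefit(path), 2))
```

Exhaustive enumeration like this can grow exponentially with the number of paths, which is the cost that a polynomial-time search or a pruning heuristic is meant to avoid.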
Pages: 187-202
Related papers
  • [1] Efficient techniques to explore and rank paths in life science data sources
    Lacroix, Z
    Raschid, L
    Vidal, ME
    DATA INTEGRATION IN THE LIFE SCIENCES, PROCEEDINGS, 2004, 2994 : 187 - 202
  • [2] Links and paths through life sciences data sources
    Lacroix, Z
    Murthy, H
    Naumann, F
    Raschid, L
    DATA INTEGRATION IN THE LIFE SCIENCES, PROCEEDINGS, 2004, 2994 : 203 - 211
  • [3] Efficient techniques for range search queries on earth science data
    Shi, QM
    JaJa, JF
    14TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, PROCEEDINGS, 2002, : 142 - 151
  • [4] On the selection of k efficient paths by clustering techniques
    Caramia, Massimiliano
    Giordani, Stefano
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2009, 1 (03) : 237 - 260
  • [5] Search, access, and explore life science nanopublications on the Web
    Giachelle, Fabio
    Dosso, Dennis
    Silvello, Gianmaria
    PEERJ COMPUTER SCIENCE, 2021, 7 : 1 - 35
  • [6] Data-driven Rank Breaking for Efficient Rank Aggregation
    Khetan, Ashish
    Oh, Sewoong
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [7] Data-driven Rank Breaking for Efficient Rank Aggregation
    Khetan, Ashish
    Oh, Sewoong
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [8] The potential of secondary data sources to explore the life chances of looked-after children in the care system in the UK
    Attar, Shalhevet
    Parker, Gillian
    Wade, Jim
    JOURNAL OF CHILDRENS SERVICES, 2007, 2 (02) : 39 - 47
  • [9] Refined JST Thesaurus Extended with Data from Other Open Life Science Data Sources
    Kushida, Tatsuya
    Tateisi, Yuka
    Masuda, Takeshi
    Watanabe, Katsutaro
    Matsumura, Katsuji
    Kawamura, Takahiro
    Kozaki, Kouji
    Takagi, Toshihisa
    SEMANTIC TECHNOLOGY, JIST 2017, 2017, 10675 : 35 - 48
  • [10] Data science and data analytics in life science research
    Bajorath, Juergen
    ARTIFICIAL INTELLIGENCE IN THE LIFE SCIENCES, 2023, 3