PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets

Cited by: 9
Authors
Djokic-Petrovic, Marija [1,2]
Cvjetkovic, Vladimir [2]
Yang, Jeremy [3,4]
Zivanovic, Marko [5 ]
Wild, David J. [3 ]
Affiliations
[1] Virtual World Serv GmbH, Asperner Heldenpl 6, A-1220 Vienna, Austria
[2] Univ Kragujevac, Dept Math & Informat, Fac Sci, Radoja Domanovica 12, Kragujevac 34000, Serbia
[3] Indiana Univ, Sch Informat & Comp, 901 E 10th St, Bloomington, IN 47408 USA
[4] Univ New Mexico, Sch Med, Translat Informat Div, Albuquerque, NM 87131 USA
[5] Univ Kragujevac, Dept Biol & Ecol, Fac Sci, Radoja Domanovica 12, Kragujevac 34000, Serbia
Keywords
Federated SPARQL query; Bioinformatics; Data integration; Ontologies; Data mining and information retrieval; RESOURCE; DATABASE; HCT-116; SYSTEM
DOI
10.1186/s13326-017-0151-z
Chinese Library Classification
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Background: There is a huge variety of data sources relevant to chemical, biological and pharmacological research, but these sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies make it possible to create links and mappings across datasets and to manage them as a single linked network, so that searches can be carried out across datasets independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to let researchers carry out such searches across a vast array of data sources.
Results: PIBAS FedSPARQL is a web-based query builder and result-set visualizer for bioinformatics data. As an advanced feature, the system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm that processes named entities for use in a Vector Space Model with cosine similarity measures. To our knowledge, PIBAS FedSPARQL is unique among the systems we found in that it allows detection of similar data items. As a query builder, the system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives such as Bio2RDF, Chem2Bio2RDF and EMBL-EBI, one local initiative called CPCTAS, and an additional user-specified data source. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users can choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of the data to enhance query results.
Conclusions: The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. In addition, the novel "similar data items detection" algorithm can be particularly useful for suggesting new data sources and for cost optimization of new experiments. PIBAS FedSPARQL can be expanded with new topics, subtopics and templates on demand, rendering information retrieval more robust.
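The abstract names a Vector Space Model with cosine similarity as the basis for detecting similar data items, but does not give the details. The following is only a minimal sketch of that general idea, not the authors' algorithm: it assumes each data item has already been reduced to a list of named-entity strings, uses raw term counts as vector weights, and the function name and example entity lists are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(entities_a, entities_b):
    """Cosine similarity between two bags of named-entity strings.

    Assumption: raw term counts are used as vector weights; the paper's
    actual weighting scheme is not stated in the abstract.
    """
    vec_a, vec_b = Counter(entities_a), Counter(entities_b)
    # Dot product over the entities the two items share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty item is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical entity lists for two items published under different URIs.
item_1 = ["hct-116", "cytotoxicity", "palladium", "complex"]
item_2 = ["hct-116", "cytotoxicity", "platinum", "complex"]
score = cosine_similarity(item_1, item_2)
```

A score close to 1 would suggest that two URIs describe similar items; any cut-off for flagging a pair as similar is a design choice the abstract does not specify.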
Pages: 20