PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets

Cited by: 9
Authors
Djokic-Petrovic, Marija [1,2]
Cvjetkovic, Vladimir [2]
Yang, Jeremy [3,4]
Zivanovic, Marko [5]
Wild, David J. [3]
Affiliations
[1] Virtual World Serv GmbH, Asperner Heldenpl 6, A-1220 Vienna, Austria
[2] Univ Kragujevac, Dept Math & Informat, Fac Sci, Radoja Domanov 12, Kragujevac 34000, Serbia
[3] Indiana Univ, Sch Informat & Comp, 901 E 10th St, Bloomington, IN 47408 USA
[4] Univ New Mexico, Sch Med, Translat Informat Div, Albuquerque, NM 87131 USA
[5] Univ Kragujevac, Dept Biol & Ecol, Fac Sci, Radoja Domanovica 12, Kragujevac 34000, Serbia
Keywords
Federated SPARQL query; Bioinformatics; Data integration; Ontologies; Data mining and information retrieval; RESOURCE; DATABASE; HCT-116; SYSTEM;
DOI
10.1186/s13326-017-0151-z
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification code
07; 0710; 09
Abstract
Background: There is a huge variety of data sources relevant to chemical, biological and pharmacological research, but these sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies make it possible to create links and mappings across datasets and to manage them as a single linked network, so that searches can run across datasets independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to let researchers carry out such searches across a vast array of data sources.
Results: PIBAS FedSPARQL is a web-based query builder and result set visualizer for bioinformatics data. As an advanced feature, the system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm that processes named entities for use in a Vector Space Model with a cosine similarity measure. To our knowledge, PIBAS FedSPARQL was the only system among those we found that can detect similar data items. As a query builder, the system lets researchers intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives such as Bio2RDF, Chem2Bio2RDF and EMBL-EBI, one local initiative called CPCTAS, and additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users can choose the data sources most appropriate to their area of interest and exploit their Resource Description Framework (RDF) structure, which lets them select particular data properties to refine the query results.
Conclusions: The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. In addition, the novel "similar data items detection" algorithm can be particularly useful for suggesting new data sources and for optimizing the cost of new experiments. PIBAS FedSPARQL can be expanded with new topics, subtopics and templates on demand, rendering information retrieval more robust.
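The abstract names a Vector Space Model with a cosine similarity measure as the basis for detecting similar data items, but gives no implementation detail. The following is a minimal sketch of that general technique only, not the authors' algorithm. It assumes each data item has already been reduced to a list of named-entity strings and uses raw term counts as weights (the paper may weight terms differently). The function names and the two entity lists are hypothetical.

```python
import math
from collections import Counter


def term_vector(entities):
    """Build a sparse term-count vector from a list of entity strings.

    Assumption: entities are compared case-insensitively and weighted
    by raw count; this is an illustrative choice, not the paper's.
    """
    return Counter(e.lower() for e in entities)


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical entity lists for two items published under different URIs.
item_a = ["HCT-116", "colon", "carcinoma", "cell line"]
item_b = ["hct-116", "human", "colon", "carcinoma"]

score = cosine_similarity(term_vector(item_a), term_vector(item_b))
print(f"similarity = {score:.2f}")
```

Items whose score exceeds a chosen threshold could then be flagged as candidates for describing the same entity; the threshold is a tuning decision that the abstract does not specify.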
Pages: 20
Citation
Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J. PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets [J]. Journal of Biomedical Semantics, 8