Reactome graph database: Efficient access to complex pathway data

Cited by: 168
Authors
Fabregat, Antonio [1, 2]
Korninger, Florian [1]
Viteri, Guilherme [1]
Sidiropoulos, Konstantinos [1]
Marin-Garcia, Pablo [3, 4]
Ping, Peipei [5, 6]
Wu, Guanming [7]
Stein, Lincoln [8, 9]
D'Eustachio, Peter [10]
Hermjakob, Henning [1, 11]
Affiliations
[1] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Hinxton, England
[2] Open Targets, Wellcome Genome Campus, Hinxton, England
[3] Univ Valencia, Fdn Invest INCLIVA, Valencia, Spain
[4] Inst Med Genom, Valencia, Spain
[5] Univ Calif Los Angeles, NIH BD2K Ctr Excellence, Los Angeles, CA USA
[6] Univ Calif Los Angeles, Dept Physiol Med & Bioinformat, Los Angeles, CA USA
[7] Oregon Hlth & Sci Univ, Portland, OR 97201 USA
[8] Ontario Inst Canc Res, Toronto, ON, Canada
[9] Univ Toronto, Dept Mol Genet, Toronto, ON, Canada
[10] NYU, Langone Med Ctr, New York, NY USA
[11] Natl Ctr Prot Sci, Beijing Inst Radiat Med, Beijing Proteome Res Ctr, State Key Lab Prote, Beijing, Peoples R China
Funding
US National Institutes of Health
DOI
10.1371/journal.pcbi.1005968
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high-quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency, because queries that traverse highly interconnected data perform poorly in a relational model. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data through object-oriented queries, and also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high-performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
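The efficiency argument in the abstract can be illustrated with a toy sketch. The Python below is not Reactome code and does not use Neo4j or the ContentService; the edge list, identifiers, and function names are all hypothetical. It contrasts a relational-style lookup, which rescans the whole edge table once per hierarchy level (the analogue of repeated self-joins), with a graph-style traversal that follows each node's stored adjacency directly.

```python
from collections import defaultdict, deque

# Toy "contains" edges (parent, child). All identifiers are made up for illustration.
EDGES = [
    ("pathway:A", "pathway:B"),
    ("pathway:A", "reaction:1"),
    ("pathway:B", "reaction:2"),
    ("pathway:B", "pathway:C"),
    ("pathway:C", "reaction:3"),
    ("pathway:X", "reaction:9"),
]


def descendants_by_table_scan(edges, root):
    """Relational-style lookup: scan the full edge table once per level."""
    found, frontier = set(), {root}
    while frontier:
        # Each iteration plays the role of one more self-join on the edge table.
        next_frontier = {child for parent, child in edges if parent in frontier}
        next_frontier -= found
        found |= next_frontier
        frontier = next_frontier
    return found


def build_adjacency(edges):
    """Store each node's outgoing edges with the node, as a graph store would."""
    adjacency = defaultdict(list)
    for parent, child in edges:
        adjacency[parent].append(child)
    return adjacency


def descendants_by_traversal(adjacency, root):
    """Graph-style lookup: breadth-first walk that touches only reachable nodes."""
    found, queue = set(), deque([root])
    while queue:
        node = queue.popleft()
        for child in adjacency.get(node, ()):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    reachable = descendants_by_traversal(adjacency, "pathway:A")
    print(sorted(n for n in reachable if n.startswith("reaction:")))
```

Both functions return the same set of descendants. The difference is in the work done: the table scan touches every edge at every level, while the traversal touches only the edges that leave reachable nodes, which is the property the abstract attributes to graph storage for highly interconnected data.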
Pages: 13