Big data in genomic research for big questions with examples from covid-19 and other zoonoses

Cited by: 1
Authors
Wassenaar, Trudy M. [1]
Ussery, David W. [2]
Rosel, Adriana Cabal [3]
Affiliations
[1] Mol Microbiol & Genom Consultants, Tannenstr 7, D-55576 Zotzenheim, Germany
[2] Univ Arkansas Med Sci, Dept Biomed Informat, 4301 W Markham St, Little Rock, AR 72205 USA
[3] Austrian Agcy Hlth & Food Safety, Inst Med Microbiol & Hyg, Div Publ Hlth, Wahringerstr 25a, A-1096 Vienna, Austria
Funding
U.S. National Science Foundation
Keywords
omics; genomics; zoonoses; COVID-19; Salmonella; scientific publishing; big data; SALMONELLA-ENTERICA; MICROBIOME; COLI;
DOI
10.1093/jambio/lxac055
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Omics research inevitably involves the collection and analysis of big data, which can only be handled by automated approaches. Here we point out that the analysis of big data in the field of genomics dictates certain requirements, such as specialized software, quality control of input data, and simplification for visualization of the results. The latter results in a loss of information, as is exemplified for phylogenetic trees. Clear communication of big data analyses can be enhanced by novel visualization strategies. The interpretation of findings is sometimes hampered when dedicated analytical tools are not fully understood by microbiologists, while the researchers performing these analyses may not have a full overview of the biology of the microbes under study. These issues are illustrated here, using SARS-CoV-2 and Salmonella enterica as zoonotic examples. Whereas in scientific communications jargon should be avoided or explained, nomenclature to group similar organisms and distinguish these from more distant relatives is not only essential, but also influences the interpretation of results. Unfortunately, changes in taxonomically accepted names are now so frequent that they hamper rather than assist research, as is illustrated with difficulties of microbiome studies. Nomenclature to group viral isolates, as is done for SARS-CoV-2, is also not without difficulties. Some weaknesses in current omics research stem from poor quality of data or biased databases, and problems can be magnified by machine learning approaches. Moreover, the overall opus of scientific publications can now be considered "big data", as is illustrated by the avalanche of COVID-19-related publications. The peer-review model of scientific publishing is only barely coping with this novel situation, resulting in retractions and the publication of bogus works. The avalanche of scientific publications that originated from the current pandemic can obstruct literature searches, and this will unfortunately continue over time.
Pages: 13