Big data in genomic research for big questions with examples from covid-19 and other zoonoses

Cited by: 1
Authors
Wassenaar, Trudy M. [1 ]
Ussery, David W. [2 ]
Rosel, Adriana Cabal [3 ]
Affiliations
[1] Mol Microbiol & Genom Consultants, Tannenstr 7, D-55576 Zotzenheim, Germany
[2] Univ Arkansas Med Sci, Dept Biomed Informat, 4301 W Markham St, Little Rock, AR 72205 USA
[3] Austrian Agcy Hlth & Food Safety, Inst Med Microbiol & Hyg, Div Publ Hlth, Wahringerstr 25a, A-1096 Vienna, Austria
Funding
US National Science Foundation;
Keywords
omics; genomics; zoonoses; COVID-19; Salmonella; scientific publishing; big data; SALMONELLA-ENTERICA; MICROBIOME; COLI;
DOI
10.1093/jambio/lxac055
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Omics research inevitably involves the collection and analysis of big data, which can only be handled by automated approaches. Here we point out that the analysis of big data in the field of genomics dictates certain requirements, such as specialized software, quality control of input data, and simplification for visualization of the results. The latter results in a loss of information, as is exemplified for phylogenetic trees. Clear communication of big data analyses can be enhanced by novel visualization strategies. The interpretation of findings is sometimes hampered when dedicated analytical tools are not fully understood by microbiologists, while the researchers performing these analyses may not have a full overview of the biology of the microbes under study. These issues are illustrated here, using SARS-CoV-2 and Salmonella enterica as zoonotic examples. Whereas in scientific communications jargon should be avoided or explained, nomenclature to group similar organisms and distinguish these from more distant relatives is not only essential, but also influences the interpretation of results. Unfortunately, changes in taxonomically accepted names are now so frequent that they hamper rather than assist research, as is illustrated with difficulties of microbiome studies. Nomenclature to group viral isolates, as is done for SARS-CoV-2, is also not without difficulties. Some weaknesses in current omics research stem from poor quality of data or biased databases, and problems can be magnified by machine learning approaches. Moreover, the overall opus of scientific publications can now be considered "big data", as is illustrated by the avalanche of COVID-19-related publications. The peer-review model of scientific publishing is only barely coping with this novel situation, resulting in retractions and the publication of bogus works. The avalanche of scientific publications that originated from the current pandemic can obstruct literature searches, and this will unfortunately continue over time.
Pages: 13
Related papers
50 in total
  • [1] Big Data Research in Fighting COVID-19: Contributions and Techniques
    Riswantini, Dianadewi
    Nugraheni, Ekasari
    Arisal, Andria
    Khotimah, Purnomo Husnul
    Munandar, Devi
    Suwarningsih, Wiwin
    BIG DATA AND COGNITIVE COMPUTING, 2021, 5 (03)
  • [2] Covid-19 Imaging Tools: How Big Data is Big?
    KC Santosh
    Sourodip Ghosh
    Journal of Medical Systems, 2021, 45
  • [3] Covid-19 Imaging Tools: How Big Data is Big?
    Santosh, K. C.
    Ghosh, Sourodip
    JOURNAL OF MEDICAL SYSTEMS, 2021, 45 (07)
  • [4] Big Data Science on COVID-19 Data
    Leung, Carson K.
    Chen, Yubo
    Shang, Siyuan
    Deng, Deyu
    2020 IEEE 14TH INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING (BIGDATASE 2020), 2020, : 14 - 21
  • [5] Symposium: COVID-19 and Big Questions of Public Administration
    Perry, James
    Lam, Wai Fung
    ASIA PACIFIC JOURNAL OF PUBLIC ADMINISTRATION, 2021, 43 (03) : 130 - 130
  • [6] Testing big data in a big crisis: Nowcasting under Covid-19
    Barbaglia, Luca
    Frattarolo, Lorenzo
    Onorante, Luca
    Pericoli, Filippo Maria
    Ratto, Marco
    Pezzoli, Luca Tiozzo
    INTERNATIONAL JOURNAL OF FORECASTING, 2023, 39 (04) : 1548 - 1563
  • [7] Big data and cutaneous manifestations of COVID-19
    Grant-Kels, Jane M.
    Sloan, Brett
    Kantor, Jonathan
    Elston, Dirk M.
    JOURNAL OF THE AMERICAN ACADEMY OF DERMATOLOGY, 2020, 83 (02) : 365 - 366
  • [8] COVID-19: Challenges to GIS with Big Data
    Zhou, Chenghu
    Su, Fenzhen
    Pei, Tao
    Zhang, An
    Du, Yunyan
    Luo, Bin
    Cao, Zhidong
    Wang, Juanle
    Yuan, Wen
    Zhu, Yunqiang
    Song, Ci
    Chen, Jie
    Xu, Jun
    Li, Fujia
    Ma, Ting
    Jiang, Lili
    Yan, Fengqin
    Yi, Jiawei
    Hu, Yunfeng
    Liao, Yilan
    Xiao, Han
    GEOGRAPHY AND SUSTAINABILITY, 2020, 1 (01) : 77 - 87
  • [9] Pandemic Ethics-8 Big Questions of COVID-19
    Lykkeskov, Anne
    ETHICAL THEORY AND MORAL PRACTICE, 2021, 24 (01) : 423 - 425
  • [10] The intersection of big data and epidemiology for epidemiologic research: The impact of the COVID-19 pandemic
    Tang, Chunlei
    Plasek, Joseph M.
    Zhang, Suhua
    Xiong, Yun
    Zhu, Yangyong
    Ma, Jing
    Zhou, Li
    Bates, David W.
    INTERNATIONAL JOURNAL FOR QUALITY IN HEALTH CARE, 2021, 33 (03) : 1 - 2