Consensus assessment of the contamination level of publicly available cyanobacterial genomes

Cited by: 34
Authors
Cornet, Luc [1 ,2 ]
Meunier, Loic [1 ]
Van Vlierberghe, Mick [1 ]
Leonard, Raphael R. [1 ,3 ]
Durieu, Benoit [4 ]
Lara, Yannick [4 ]
Misztak, Agnieszka [1 ,5 ]
Sirjacobs, Damien [1 ]
Javaux, Emmanuelle J. [2 ]
Philippe, Herve [6 ]
Wilmotte, Annick [4 ]
Baurain, Denis [1 ]
Affiliations
[1] Univ Liege, InBioS PhytoSYST, Eukaryot Phylogen, Liege, Belgium
[2] Univ Liege, UR Geol Palaeobiogeol Palaeobot Palaeopalynol, Liege, Belgium
[3] Univ Liege, InBioS CIP, Macromol Crystallog, Liege, Belgium
[4] Univ Liege, Ctr Prot Engn, InBioS CIP, Liege, Belgium
[5] Intercollegiate Fac Biotechnol UG MUG, Gdansk, Poland
[6] Ctr Biodivers Theory & Modelling, Moulis, France
Source
PLOS ONE | 2018 / Volume 13 / Issue 07
Funding
European Research Council;
Keywords
HORIZONTAL GENE-TRANSFER; MULTIPLE SEQUENCE ALIGNMENT; BACTERIAL DIVERSITY; EVOLUTION; ORIGIN; TOOL; QUANTIFICATION; ANNOTATION; CONSISTENT; QUALITY;
DOI
10.1371/journal.pone.0200323
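Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code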
07 ; 0710 ; 09 ;
Abstract
Publicly available genomes are crucial for phylogenetic and metagenomic studies, in which contaminating sequences can cause major problems. This issue is expected to be especially important for Cyanobacteria because axenic strains are notoriously difficult to obtain and keep in culture. Yet, despite their great scientific interest, no data are currently available concerning the quality of publicly available cyanobacterial genomes. As reliably detecting contaminants is a complex task, we designed a pipeline combining six methods in a consensus strategy to assess the contamination level of 440 genome assemblies of Cyanobacteria. Two methods are based on published reference databases of ribosomal genes (SSU rRNA 16S and ribosomal proteins), one is indirectly based on a reference database of marker genes (CheckM), and three are based on complete genome analysis. Among those genome-wide methods, Kraken and DIAMOND blastx share the same reference database, which we derived from Ensembl Bacteria, whereas CONCOCT does not require any reference database, instead relying on differences in DNA tetramer frequencies. Given that all six methods appear to have their own strengths and limitations, we used the consensus of their rankings to infer that >5% of cyanobacterial genome assemblies are highly contaminated by foreign DNA (i.e., contaminants were detected by 5 or 6 methods). Our results will help researchers check the quality of publicly available genomic data before using them in their own analyses. Moreover, we argue that journals should make the submission of raw read data along with genome assemblies mandatory, in order to facilitate the detection of contaminants in sequence databases.
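The consensus rule stated in the abstract (an assembly counts as highly contaminated when contaminants are detected by 5 or 6 of the six methods) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each method's output has already been reduced to a yes/no detection call per assembly, and the method keys, function names, and assembly identifiers below are hypothetical.

```python
from typing import Dict, Mapping

# Hypothetical keys for the six methods named in the abstract.
METHODS = (
    "ssu_rrna_16s",
    "ribosomal_proteins",
    "checkm",
    "kraken",
    "diamond_blastx",
    "concoct",
)


def count_detections(calls: Mapping[str, bool]) -> int:
    """Count how many of the six methods reported contaminants."""
    missing = set(METHODS) - set(calls)
    if missing:
        raise ValueError(f"missing calls for: {sorted(missing)}")
    return sum(bool(calls[m]) for m in METHODS)


def is_highly_contaminated(calls: Mapping[str, bool], min_methods: int = 5) -> bool:
    """Apply the '5 or 6 methods' rule from the abstract (assumed threshold)."""
    return count_detections(calls) >= min_methods


def highly_contaminated_fraction(assemblies: Mapping[str, Mapping[str, bool]]) -> float:
    """Fraction of assemblies flagged by the consensus rule."""
    if not assemblies:
        return 0.0
    flagged = sum(is_highly_contaminated(c) for c in assemblies.values())
    return flagged / len(assemblies)


# Made-up detection calls for two hypothetical assemblies.
example: Dict[str, Dict[str, bool]] = {
    "assembly_A": {m: True for m in METHODS},           # all six methods agree
    "assembly_B": {m: m == "kraken" for m in METHODS},  # only one method flags it
}
```

With these made-up calls, `is_highly_contaminated(example["assembly_A"])` is true and `assembly_B` is not flagged, so `highly_contaminated_fraction(example)` is 0.5. Requiring agreement from most methods reflects the abstract's point that each method has its own strengths and limitations.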
Pages: 26