Whole-genome automated assembly pipeline for Chlamydia trachomatis strains from reference, in vitro and clinical samples using the integrated CtGAP pipeline

Authors
Olagoke, Olusola [1 ,2 ]
Aziz, Ammar [3 ]
Zhu, Lucile H. [4 ,5 ]
Read, Timothy D. [6 ,7 ]
Dean, Deborah [1 ,2 ,4 ,5 ,8 ,9 ,10 ]
Affiliations
[1] Univ Calif San Francisco, Sch Med, Dept Med, Div Infect Dis & Global Hlth, 550 16th St,4th Floor Mission Hall, San Francisco, CA 94158 USA
[2] Univ Calif San Francisco, Sch Med, Dept Pediat, Div Infect Dis & Global Hlth, 550 16th St,4th Floor Mission Hall, San Francisco, CA 94158 USA
[3] Victorian Infect Dis Reference Lab, 792 Elizabeth St, Melbourne, Vic 3000, Australia
[4] Univ Calif San Francisco, Dept Bioengn, 306 Stanley Hall, Berkeley, CA 94720 USA
[5] Berkeley Sch Engn, 306 Stanley Hall, Berkeley, CA 94720 USA
[6] Emory Univ, Sch Med, Dept Med, Div Infect Dis, 100 Woodruff Circle, Atlanta, GA 30322 USA
[7] Emory Univ, Sch Med, Dept Genet, Div Infect Dis, 100 Woodruff Circle, Atlanta, GA 30322 USA
[8] Univ Calif San Francisco, Bixby Ctr Global Reprod Hlth, 1001 Potrero Ave, San Francisco, CA 94110 USA
[9] Univ Calif San Francisco, Benioff Ctr Microbiome Med, 513 Parnassus Ave,S357, San Francisco, CA 94143 USA
[10] Univ Calif San Francisco, Inst Global Hlth Sci, 550 16th St,3rd Floor Mission Hall, San Francisco, CA 94158 USA
Funding
U.S. National Institutes of Health;
Keywords
ALIGNMENT;
DOI
10.1093/nargab/lqae187
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Whole genome sequencing (WGS) is pivotal for the molecular characterization of Chlamydia trachomatis (Ct), the leading bacterial cause of sexually transmitted infections and infectious blindness worldwide. Ct WGS can inform epidemiologic, public health and outbreak investigations of these human-restricted pathogens. However, challenges persist in generating high-quality genomes for downstream analyses given its obligate intracellular nature and difficulty with in vitro propagation. No single tool exists for the entirety of Ct genome assembly, necessitating the adaptation of multiple programs with varying success. Compounding this issue is the absence of reliable Ct reference strain genomes. We, therefore, developed CtGAP (Chlamydia trachomatis Genome Assembly Pipeline) as an integrated 'one-stop-shop' pipeline for assembly and characterization of Ct genome sequencing data from various sources including isolates, in vitro samples, clinical swabs and urine. CtGAP, written in Snakemake, enables read quality statistics output, adapter and quality trimming, host read removal, de novo and reference-guided assembly, contig scaffolding, selective ompA, multi-locus sequence and plasmid typing, phylogenetic tree construction, and recombinant genome identification. Twenty Ct reference genomes were also generated. Successfully validated on a diverse collection of 363 samples containing Ct, CtGAP represents a novel pipeline requiring minimal bioinformatics expertise with easy adaptation for use with other bacterial species.
Pages: 10