Transcriptome deep-sequencing and clustering of expressed isoforms from Favia corals

Cited by: 18
Authors
Mehr, Shaadi F. Pooyaei [1, 2]
DeSalle, Rob [2]
Kao, Hung-Teh [3]
Narechania, Apurva [2]
Han, Zhou [4]
Tchernov, Dan [5]
Pieribone, Vincent [4]
Gruber, David F. [1, 2, 6]
Affiliations
[1] CUNY, Grad Ctr, New York, NY 10065 USA
[2] Amer Museum Nat Hist, Sackler Inst Comparat Genom, New York, NY 10024 USA
[3] Brown Univ, Warren Alpert Med Sch, Dept Psychiat & Human Behav, Div Biol & Med, Providence, RI 02912 USA
[4] Yale Univ, John B Pierce Lab, New Haven, CT 06519 USA
[5] Univ Haifa, Leon H Charney Sch Marine Sci, Dept Marine Biol, IL-31905 Haifa, Israel
[6] CUNY, Baruch Coll, Dept Nat Sci, New York, NY 10010 USA
Source
BMC GENOMICS | 2013 / Vol. 14
Funding
US National Science Foundation;
Keywords
K-mer; Contig; Open reading frame; Fluorescent protein; Blast; Clustering; High-throughput sequencing; Illumina paired-end; Coral; GREEN FLUORESCENT PROTEINS; RNA-SEQ; SCLERACTINIAN CORALS; DNA-SEQUENCES; ALIGNMENT; GENOME; PHYLOGENY; EVOLUTION; RESPONSES; SELECTION;
DOI
10.1186/1471-2164-14-546
Chinese Library Classification (CLC)
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Genomic and transcriptomic sequence data are essential tools for tackling ecological problems. Using an approach that combines next-generation sequencing, de novo transcriptome assembly, gene annotation and synthetic gene construction, we identify and cluster the protein families from Favia corals from the northern Red Sea.

Results: We obtained 80 million 75 bp paired-end cDNA reads from two Favia adult samples collected at 65 m (Fav1, Fav2) on the Illumina GA platform, and generated two de novo assemblies using ABySS and CAP3. After removing redundancy and filtering out low-quality reads, our transcriptome datasets contained 58,268 (Fav1) and 62,469 (Fav2) contigs longer than 100 bp, with N50 values of 1,665 bp and 1,439 bp, respectively. Using the proteome of the sea anemone Nematostella vectensis as a reference, we were able to annotate almost 20% of each dataset using reciprocal homology searches. Homologous clustering of these annotated transcripts allowed us to divide them into 7,186 (Fav1) and 6,862 (Fav2) homologous transcript clusters (E-value ≤ 2e-30). Functional annotation categories were assigned to homologous clusters using the functional annotation of Nematostella vectensis. General annotation of the assembled transcripts was improved 1-3% using the Acropora digitifera proteome. In addition, we screened these transcript isoform clusters for fluorescent protein (FP) homologs and identified seven potential FP homologs in Fav1, and four in Fav2. These transcripts were validated as bona fide FP transcripts via robust fluorescence heterologous expression. Annotation of the assembled contigs revealed that 1.34% and 1.61% (in Fav1 and Fav2, respectively) of the total assembled contigs likely originated from the corals' algal symbiont, Symbiodinium spp.

Conclusions: Here we present a study to identify the homologous transcript isoform clusters from the transcriptome of Favia corals using a distantly related reference proteome. Furthermore, the symbiont-derived transcripts were isolated from the datasets and their contribution quantified. This is the first annotated transcriptome of the genus Favia, a major increase in the genomic resources available for this important family of corals.
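The abstract reports assembly quality as N50 values (1,665 bp for Fav1 and 1,439 bp for Fav2). As a minimal sketch of how that statistic is commonly computed, the Python snippet below sorts contig lengths from longest to shortest and returns the length at which the running total first reaches half of the assembled bases. The function name `n50` and the toy contig lengths are illustrative assumptions; they are not taken from the study, and the authors' own tooling is not described in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy assembly (not data from the study).
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))  # 400: 500 + 400 = 900 >= 750 (half of 1,500)
```

Because N50 is weighted by bases rather than by contig count, a few long contigs can raise it even when most contigs are short, which is why it is usually reported alongside the total number of contigs, as the abstract does.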
Pages: 13