Transcriptome deep-sequencing and clustering of expressed isoforms from Favia corals

Cited by: 18
Authors
Mehr, Shaadi F. Pooyaei [1 ,2 ]
DeSalle, Rob [2 ]
Kao, Hung-Teh [3 ]
Narechania, Apurva [2 ]
Han, Zhou [4 ]
Tchernov, Dan [5 ]
Pieribone, Vincent [4 ]
Gruber, David F. [1 ,2 ,6 ]
Affiliations
[1] CUNY, Grad Ctr, New York, NY 10065 USA
[2] Amer Museum Nat Hist, Sackler Inst Comparat Genom, New York, NY 10024 USA
[3] Brown Univ, Warren Alpert Med Sch, Dept Psychiat & Human Behav, Div Biol & Med, Providence, RI 02912 USA
[4] Yale Univ, John B Pierce Lab, New Haven, CT 06519 USA
[5] Univ Haifa, Leon H Charney Sch Marine Sci, Dept Marine Biol, IL-31905 Haifa, Israel
[6] CUNY, Baruch Coll, Dept Nat Sci, New York, NY 10010 USA
Source
BMC GENOMICS | 2013 / Vol. 14
Funding
National Science Foundation (USA);
Keywords
K-mer; Contig; Open reading frame; Fluorescent protein; Blast; Clustering; High-throughput sequencing; Illumina paired-end; Coral; GREEN FLUORESCENT PROTEINS; RNA-SEQ; SCLERACTINIAN CORALS; DNA-SEQUENCES; ALIGNMENT; GENOME; PHYLOGENY; EVOLUTION; RESPONSES; SELECTION;
DOI
10.1186/1471-2164-14-546
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Genomic and transcriptomic sequence data are essential tools for tackling ecological problems. Using an approach that combines next-generation sequencing, de novo transcriptome assembly, gene annotation and synthetic gene construction, we identify and cluster the protein families of Favia corals from the northern Red Sea.
Results: We obtained 80 million 75 bp paired-end cDNA reads from two adult Favia samples (Fav1, Fav2) collected at 65 m, sequenced on the Illumina GA platform, and generated two de novo assemblies using ABySS and CAP3. After removing redundancy and filtering out low-quality reads, our transcriptome datasets contained 58,268 (Fav1) and 62,469 (Fav2) contigs longer than 100 bp, with N50 values of 1,665 bp and 1,439 bp, respectively. Using the proteome of the sea anemone Nematostella vectensis as a reference, we annotated almost 20% of each dataset by reciprocal homology searches. Homologous clustering divided these annotated transcripts into 7,186 (Fav1) and 6,862 (Fav2) homologous transcript clusters (E-value ≤ 2e-30). Functional annotation categories were assigned to the homologous clusters using the functional annotation of Nematostella vectensis. Using the Acropora digitifera proteome improved general annotation of the assembled transcripts by 1-3%. In addition, we screened these transcript isoform clusters for fluorescent protein (FP) homologs and identified seven potential FP homologs in Fav1 and four in Fav2. These transcripts were validated as bona fide FP transcripts via heterologous expression, which produced robust fluorescence. Annotation of the assembled contigs revealed that 1.34% (Fav1) and 1.61% (Fav2) of the total assembled contigs likely originated from the corals' algal symbiont, Symbiodinium spp.
Conclusions: Here we present a study that identifies the homologous transcript isoform clusters in the transcriptome of Favia corals using a distantly related reference proteome. Furthermore, the symbiont-derived transcripts were isolated from the datasets and their contribution quantified. This is the first annotated transcriptome of the genus Favia, a major increase in the genomic resources available for this important family of corals.
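The abstract reports assembly contiguity as N50 values. As a reminder of what that statistic measures, here is a minimal sketch of the usual N50 definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' code or data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    # Sort contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy contig lengths, not data from the study.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # prints 400
```

In the toy example the total length is 1,500 bp, and the two longest contigs (500 + 400 = 900 bp) are the first to cover at least 750 bp, so the N50 is 400 bp.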
Pages: 13