HLA Haplotyping from RNA-seq Data Using Hierarchical Read Weighting

Cited by: 40
Authors
Kim, Hyunsung John [1]
Pourmand, Nader [1]
Affiliations
[1] Univ Calif Santa Cruz, Dept Biomol Engn, Baskin Sch Engn, Santa Cruz, CA 95064 USA
Source
PLOS ONE | 2013 / Volume 8 / Issue 06
Keywords
STEM-CELL TRANSPLANTATION; HIGH-RESOLUTION HLA; HIGH-THROUGHPUT; GENE FUSIONS; GENERATION; CANCER; MHC; NOMENCLATURE; POPULATION; ALLELES;
DOI
10.1371/journal.pone.0067885
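Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract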
Correctly matching the HLA haplotypes of donor and recipient is essential to the success of allogeneic hematopoietic stem cell transplantation. Current HLA typing methods rely on targeted testing of recognized antigens or sequences. Despite advances in next-generation sequencing, general high-throughput transcriptome sequencing is currently underutilized for HLA haplotyping because of the central difficulty of aligning sequences within this highly variable region. Here we present HLAforest, a method that accurately predicts HLA haplotypes by hierarchically weighting reads and applying an iterative, greedy, top-down pruning technique. HLAforest correctly predicts >99% of allele-group-level (2-digit) haplotypes and 93% of peptide-level (4-digit) haplotypes of the most diverse HLA genes in simulations with read lengths and error rates modeling currently available sequencing technology. The method is robust to sequencing error and can predict 99% of allele-group-level haplotypes with substitution rates as high as 8.8%. When applied to data generated from a trio of cell lines, HLAforest corroborated PCR-based HLA haplotyping methods and accurately predicted 16/18 (89%) major class I genes for a daughter-father-mother trio at the peptide level. Major class II genes were predicted with 100% concordance between the daughter-father-mother trio. In fifty HapMap samples with paired-end reads just 37 nucleotides long, HLAforest predicted 96.5% of allele-group-level HLA haplotypes and 83% of peptide-level haplotypes correctly. In sixteen RNA-seq samples with limited coverage across HLA genes, HLAforest predicted 97.7% of allele-group-level haplotypes and 85% of peptide-level haplotypes correctly.
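The abstract reports accuracy at two resolutions: allele group (2-digit, the first field of an allele name) and peptide level (4-digit, the first two fields). The sketch below illustrates how such a concordance figure could be computed from predicted and reference typings. It is not the HLAforest implementation; the helper names `truncate_allele` and `concordance` and the sample typings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def truncate_allele(allele: str, fields: int) -> str:
    """Keep the first `fields` colon-separated fields of an HLA allele name.

    fields=1 gives the allele group (2-digit), fields=2 the peptide level (4-digit).
    """
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def concordance(predicted: dict, truth: dict, fields: int) -> float:
    """Fraction of reference alleles matched by the prediction at a given resolution.

    Alleles are compared per gene as multisets, so the order of the two
    alleles within a gene does not matter.
    """
    matched = 0
    total = 0
    for gene, true_alleles in truth.items():
        true_counts = Counter(truncate_allele(a, fields) for a in true_alleles)
        pred_counts = Counter(
            truncate_allele(a, fields) for a in predicted.get(gene, [])
        )
        # Counter intersection keeps the minimum count of each shared allele.
        matched += sum((true_counts & pred_counts).values())
        total += sum(true_counts.values())
    return matched / total


# Hypothetical typings for one sample (illustrative values, not from the paper).
truth = {"A": ["A*02:01:01", "A*24:02"], "B": ["B*07:02", "B*44:03"]}
predicted = {"A": ["A*02:01", "A*24:03"], "B": ["B*44:03", "B*07:02"]}

group_level = concordance(predicted, truth, fields=1)
peptide_level = concordance(predicted, truth, fields=2)
print(group_level, peptide_level)
```

In this made-up sample, all four alleles agree at the allele-group level, while one of the four differs in its second field, which lowers the peptide-level figure. The same kind of gap between the two resolutions appears in the reported results.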
Pages: 10
Related papers
50 in total
  • [1] HLA Haplotyping from RNA-seq Data Using Hierarchical Read Weighting (vol 8, e67885, 2013)
    Kim, H. J.
    Pourmand, N.
    PLOS ONE, 2014, 9 (07):
  • [2] RNA-seq Read Simulator using SAM Template
    Lee, Sang-min
    Tak, Haesung
    Park, Kiejung
    Cho, Hwangue
    Lee, Dohoon
    2013 INTERNATIONAL CONFERENCE ON IT CONVERGENCE AND SECURITY (ICITCS), 2013,
  • [3] NASA GeneLab RNA-seq consensus pipeline: standardized processing of short-read RNA-seq data
    Overbey, Eliah G.
    Saravia-Butler, Amanda M.
    Zhang, Zhe
    Rathi, Komal S.
    Fogle, Homer
    da Silveira, Willian A.
    Barker, Richard J.
    Bass, Joseph J.
    Beheshti, Afshin
    Berrios, Daniel C.
    Blaber, Elizabeth A.
    Cekanaviciute, Egle
    Costa, Helio A.
    Davin, Laurence B.
    Fisch, Kathleen M.
    Gebre, Samrawit G.
    Geniza, Matthew
    Gilbert, Rachel
    Gilroy, Simon
    Hardiman, Gary
    Herranz, Raul
    Kidane, Yared H.
    Kruse, Colin P. S.
    Lee, Michael D.
    Liefeld, Ted
    Lewis, Norman G.
    McDonald, J. Tyson
    Meller, Robert
    Mishra, Tejaswini
    Perera, Imara Y.
    Ray, Shayoni
    Reinsch, Sigrid S.
    Rosenthal, Sara Brin
    Strong, Michael
    Szewczyk, Nathaniel J.
    Tahimic, Candice G. T.
    Taylor, Deanne M.
    Vandenbrink, Joshua P.
    Villacampa, Alicia
    Weging, Silvio
    Wolverton, Chris
    Wyatt, Sarah E.
    Zea, Luis
    Costes, Sylvain V.
    Galazka, Jonathan M.
    ISCIENCE, 2021, 24 (04)
  • [4] Estimation of isoform expression in RNA-seq data using a hierarchical Bayesian model
    Wang, Zengmiao
    Wang, Jun
    Wu, Changjing
    Deng, Minghua
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2015, 13 (06)
  • [5] Bayesian Hierarchical Model for Differential Gene Expression Using RNA-Seq Data
    Lee J.
    Ji Y.
    Liang S.
    Cai G.
    Müller P.
    Statistics in Biosciences, 2015, 7 (1) : 48 - 67
  • [6] Evaluation of the capacities of mouse TCR profiling from short read RNA-seq data
    Bai, Yu
    Wang, David
    Li, Wentian
    Huang, Ying
    Ye, Xuan
    Waite, Janelle
    Barry, Thomas
    Edelman, Kurt H.
    Levenkova, Natasha
    Guo, Chunguang
    Skokos, Dimitris
    Wei, Yi
    Macdonald, Lynn E.
    Fury, Wen
    PLOS ONE, 2018, 13 (11):
  • [7] HLA typing from RNA-Seq sequence reads
    Sebastian Boegel
    Martin Löwer
    Michael Schäfer
    Thomas Bukur
    Jos de Graaf
    Valesca Boisguérin
    Özlem Türeci
    Mustafa Diken
    John C Castle
    Ugur Sahin
    Genome Medicine, 4
  • [8] HLA typing from RNA-Seq sequence reads
    Boegel, Sebastian
    Loewer, Martin
    Schaefer, Michael
    Bukur, Thomas
    de Graaf, Jos
    Boisguerin, Valesca
    Tuereci, Oezlem
    Diken, Mustafa
    Castle, John C.
    Sahin, Ugur
    GENOME MEDICINE, 2012, 4
  • [9] dsRID: in silico identification of dsRNA regions using long-read RNA-seq data
    Yamamoto, Ryo
    Liu, Zhiheng
    Choudhury, Mudra
    Xiao, Xinshu
    BIOINFORMATICS, 2023, 39 (11)
  • [10] eQTL Mapping Using RNA-seq Data
    Sun W.
    Hu Y.
    Statistics in Biosciences, 2013, 5 (1) : 198 - 219