Forseti: a mechanistic and predictive model of the splicing status of scRNA-seq reads

Authors
He, Dongze [1, 2]
Gao, Yuan [1, 2]
Chan, Spencer Skylar [3]
Quintana-Parrilla, Natalia [4]
Patro, Rob [1, 3]
Affiliations
[1] Univ Maryland, Ctr Bioinformat & Computat Biol, College Pk, MD 20742 USA
[2] Univ Maryland, Program Computat Biol Bioinformat & Genom, College Pk, MD 20742 USA
[3] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[4] Univ Puerto Rico, Dept Biol, Mayaguez Campus, Mayaguez, PR 00682 USA
Funding
US National Institutes of Health; US National Science Foundation
DOI
10.1093/bioinformatics/btae207
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Short-read single-cell RNA-sequencing (scRNA-seq) has been used to study cellular heterogeneity, cellular fate, and transcriptional dynamics. Modeling splicing dynamics in scRNA-seq data is challenging, with inherent difficulty in even the seemingly straightforward task of elucidating the splicing status of the molecules from which sequenced fragments are drawn. This difficulty arises, in part, from the limited read length and positional biases, which substantially reduce the specificity of the sequenced fragments. As a result, the splicing status of many reads in scRNA-seq is ambiguous because of a lack of definitive evidence. We are therefore in need of methods that can recover the splicing status of ambiguous reads which, in turn, can lead to more accuracy and confidence in downstream analyses.

Results: We develop Forseti, a predictive model to probabilistically assign a splicing status to scRNA-seq reads. Our model has two key components. First, we train a binding affinity model to assign a probability that a given transcriptomic site is used in fragment generation. Second, we fit a robust fragment length distribution model that generalizes well across datasets deriving from different species and tissue types. Forseti combines these two trained models to predict the splicing status of the molecule of origin of reads by scoring putative fragments that associate each alignment of sequenced reads with proximate potential priming sites. Using both simulated and experimental data, we show that our model can precisely predict the splicing status of many reads and identify the true gene origin of multi-gene mapped reads.
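
As a rough illustration of the scoring idea described in the abstract, the sketch below combines a per-site binding probability with a fragment length probability and normalizes the scores of the spliced and unspliced alignments of one read. It is not the Forseti implementation. The function names, the Gaussian length model with mean 300 and standard deviation 100, the use of the best-scoring site, and all coordinates and probabilities are assumptions made for illustration only; the paper trains its own binding affinity model and fits its own fragment length distribution.

```python
import math


def fragment_length_prob(length, mean=300.0, sd=100.0):
    """Toy Gaussian density over fragment lengths (assumed, not the fitted model)."""
    z = (length - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def score_candidates(read_pos, priming_sites):
    """Score an alignment by its best putative fragment.

    priming_sites is a list of (site_position, binding_probability) pairs.
    A putative fragment spans from read_pos to a downstream priming site.
    """
    best = 0.0
    for site_pos, bind_prob in priming_sites:
        frag_len = site_pos - read_pos
        if frag_len <= 0:
            continue  # site is not downstream of the read
        best = max(best, bind_prob * fragment_length_prob(frag_len))
    return best


def spliced_probability(spliced_score, unspliced_score):
    """Normalize the two scores; return 0.5 when there is no evidence."""
    total = spliced_score + unspliced_score
    if total == 0.0:
        return 0.5
    return spliced_score / total


# Hypothetical read: on the spliced transcript it lies 300 bases upstream of a
# strong priming site; on the unspliced transcript the nearest site is weaker
# and 1300 bases away.
spliced_score = score_candidates(700, [(1000, 0.9)])
unspliced_score = score_candidates(4700, [(6000, 0.3)])
p_spliced = spliced_probability(spliced_score, unspliced_score)
print(f"P(spliced) = {p_spliced:.4f}")
```

In this toy case the spliced alignment dominates because its implied fragment length is close to the assumed mode of the length model, while the unspliced alignment implies an implausibly long fragment.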
Pages: i297-i306
Number of pages: 10