scRAA: the development of a robust and automatic annotation procedure for single-cell RNA sequencing data

Authors
Yan, Dongyan [1]
Sun, Zhe [1]
Fang, Jiyuan [1]
Cao, Shanshan [1]
Wang, Wenjie [2]
Chang, Xinyue [2]
Badirli, Sarkhan [2]
Fu, Haoda [2]
Liu, Yushi [1,3]
Affiliations
[1] Eli Lilly & Co, Global Stat Sci, Indianapolis, IN USA
[2] Eli Lilly & Co, Adv Analyt & Data Sci, Indianapolis, IN USA
[3] Eli Lilly & Co, Global Stat Sci, 893 Delaware St, Indianapolis, IN 46225 USA
Keywords
Batch effect correction; cell-type classification; ensembled method; SEQ;
DOI
10.1080/10543406.2023.2208671
Chinese Library Classification
R9 [Pharmacy];
Discipline classification code
1007;
Abstract
A critical task in single-cell RNA sequencing (scRNA-Seq) data analysis is identifying cell types in heterogeneous tissues. Although most classification methods have demonstrated high performance on scRNA-Seq annotation problems, a robust and accurate solution is still needed to generate reliable outcomes for downstream analyses such as marker gene identification, differential expression analysis, and pathway analysis. It is hard to establish a universally good metric, so no classification method is universally good across all scenarios. In addition, reference and query data in cell classification usually come from different experimental batches, and failing to account for batch effects may lead to misleading conclusions. To overcome this bottleneck, we propose a robust ensemble approach to classify cells, combined with a batch correction method between reference and query data. We simulated four scenarios that range from simple to complex batch effects and account for varying cell-type proportions, and we further tested our approach on both lung and pancreas data. Across the simulation scenarios and the real data, incorporating batch effect correction between reference and query and using the ensemble approach improved cell-type prediction accuracy while maintaining robustness.
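The abstract does not say how the ensemble combines its base classifiers, so the following is only a minimal sketch of one common option, majority voting over per-cell labels. The function name `majority_vote`, the tie-breaking rule, and the example labels are all assumptions made for illustration and are not taken from the paper; batch correction is not shown.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> list[str]:
    """Combine per-classifier label lists into one label per cell.

    predictions[k][i] is the label classifier k assigns to cell i.
    Ties go to the label proposed by the earliest-listed classifier
    (an assumed rule, not one stated in the abstract).
    """
    if not predictions:
        return []
    n_cells = len(predictions[0])
    if any(len(p) != n_cells for p in predictions):
        raise ValueError("every classifier must label the same cells")

    consensus = []
    for i in range(n_cells):
        votes = [p[i] for p in predictions]
        counts = Counter(votes)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        consensus.append(next(v for v in votes if counts[v] == top))
    return consensus


# Hypothetical outputs of three base classifiers on four query cells.
clf_a = ["alpha", "beta", "delta", "alpha"]
clf_b = ["alpha", "beta", "beta", "gamma"]
clf_c = ["beta", "beta", "delta", "delta"]
labels = majority_vote([clf_a, clf_b, clf_c])
```

In this sketch a cell on which all classifiers disagree falls back to the first classifier's label; a real pipeline might instead mark such cells as unassigned.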
Pages: 14