Biases in machine-learning models of human single-cell data

Cited by: 0
Authors
Theresa Willem [1]
Vladimir A. Shitov [2]
Malte D. Luecken [3]
Niki Kilbertus [4]
Stefan Bauer [3]
Marie Piraud [4]
Alena Buyx [2]
Fabian J. Theis [5]
Affiliations
[1] Technical University of Munich, TUM School for Medicine and Health, Institute of History and Ethics in Medicine
[2] Helmholtz Munich, Department of Computational Health, Institute of Computational Biology
[3] Helmholtz Munich, Comprehensive Pneumology Center (CPC) with the CPC
[4] Helmholtz Munich; Member of the German Center for Lung Research (DZL), M bioArchive and Institute of Lung Health and Immunity (LHI)
[5] Technical University of Munich, School for Computation, Information and Technology
[6] Munich Center for Machine Learning (MCML), School of Life Sciences
[7] Technical University of Munich
DOI
10.1038/s41556-025-01619-8
Abstract
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Pages: 384-392
Page count: 8