Biases in machine-learning models of human single-cell data

Cited by: 0
Authors
Theresa Willem [1]
Vladimir A. Shitov [2]
Malte D. Luecken [3]
Niki Kilbertus [4]
Stefan Bauer [3]
Marie Piraud [4]
Alena Buyx [2]
Fabian J. Theis [5]
Affiliations
[1] Technical University of Munich, TUM School for Medicine and Health, Institute of History and Ethics in Medicine
[2] Helmholtz Munich, Department of Computational Health, Institute of Computational Biology
[3] Helmholtz Munich, Comprehensive Pneumology Center (CPC) with the CPC
[4] Helmholtz Munich; Member of the German Center for Lung Research (DZL), M bioArchive and Institute of Lung Health and Immunity (LHI)
[5] Technical University of Munich, School for Computation, Information and Technology
[6] Munich Center for Machine Learning (MCML), School of Life Sciences
[7] Technical University of Munich
DOI
10.1038/s41556-025-01619-8
Abstract
Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
Pages: 384-392
Number of pages: 8