The (in)dependence of single-cell data inferences on model constructs

Authors
Grgicak, Catherine M. [1, 2]
Slooten, Klaas [3, 4]
Cowell, Robert G. [5]
Bhembe, Qhawe [2]
Lun, Desmond S. [2, 6]
Affiliations
[1] Rutgers State Univ, Dept Chem, Program Forens Sci, Camden, NJ 08102 USA
[2] Rutgers State Univ, Ctr Computat & Integrat Biol, Camden, NJ 08102 USA
[3] Netherlands Forens Inst, POB 24044, NL-2490 AA The Hague, Netherlands
[4] Vrije Univ Amsterdam, De Boelelaan 1081, NL-1081 HV Amsterdam, Netherlands
[5] City Univ London, London, England
[6] Rutgers State Univ, Dept Comp Sci, 315 Penn St R306A, Camden, NJ 08102 USA
Keywords
Forensic DNA; Single-cell forensics; Single-cell genetics; Single-cell inference; Likelihood ratio; Probabilistic genotyping; EESCIt; TD; DCM; LR calibration; DEVELOPMENTAL VALIDATION; DNA MIXTURES; LOW-TEMPLATE; MULTIPLEX; PROPOSITIONS; SYSTEM; NUMBER; LEVEL;
DOI
10.1016/j.fsigen.2024.103220
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Recent developments in single-cell analysis have revolutionized basic research and have garnered the attention of the forensic domain. Though single-cell analysis is not new to forensics, the ways in which these data can be generated and interpreted are. Modern interpretation strategies report likelihood ratios that rely on a simplified model of the world. It is, therefore, plausible that different reasonable models will assign noticeably different weights of evidence (WoEs) to some of these data, resulting in inconsistent reports and protracted reviews of that evidence, potentially across years. With one goal of research being to identify and understand sources of inconsistencies during early stages, we undertake a study that evaluates WoE at the limit of one single-cell electropherogram (scEPG) across three architecturally distinct probabilistic models: EESCIt (Evidentiary Evaluation of Single Cells), TD (Top-Down), and DCM (Discrete Cell Model). To do this, we performance test the three models on a set of 996 individual scEPGs and conduct, per scEPG, one H1-true (true contributor) test and 201 H2-true (false contributor) tests. With the 201,192 outcomes per model, we confirm that scEPGs resolve the hypotheses well, regardless of which model is applied. We also observe that WoEs increase, on average, by 1 for every 1000 RFU of total intensity added until a plateau near the logarithm of the inverse of the random match probability is reached at ca. 22,000 RFU. By querying WoE calibration for each model, we determine whether the evidence is over- or under-stated for any one of them. We find that for WoE ≥ -1 hardly any calibration discrepancy is observed. There were rare instances, however, for which WoEs that were ≤ -1 pointed too strongly in the negative direction, though H1 was true. This was the result of five scEPGs that not only exhibited extreme signal in stutter positions, but also carried little information in other loci. These findings show that all three models appropriately stated WoEs for scEPGs when reporting positive WoE, and that the two continuous models' WoEs reasonably represented the findings when WoE < -1 for most loci. To explore further, we continued with paired analyses that evaluated the agreement in WoE, per scEPG, across models. Unlike unpaired analyses, this evaluation determines whether well-performing models return equivalent results for the same scEPG. The paired analysis was summarized by way of intraclass correlations, which were at least 0.99997. Further, we found that 762 of 996 WoEs were within a range of 3 orders of magnitude of each other, though many of these were associated with WoEs that were large, i.e., > 9, in the first instance. When we focus more closely on scEPGs giving ranges ≥ 3, but whose WoE ≤ 9 for at least one of the models, we find there are 21 of them. When we perform a locus-by-locus investigation of these 21 and of the five scEPGs returning too strongly negative WoE for true contributors, we find that extreme stutter is usually the cause of the challenges. To ameliorate differences in predicting rare, though impactful, events, we proffer interpretive adaptations that extend beyond manually addressing the phenomena. With the WoE being calibrated within their relevant regions across EESCIt, TD and DCM, we categorize each as meeting the pillar of legitimacy for single-cell data within their intended WoE ranges.
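The test counts and the paired-range criterion described in the abstract can be illustrated with a minimal Python sketch. It assumes that WoE is the base-10 logarithm of the likelihood ratio (suggested by the abstract's "orders of magnitude" wording but not stated outright), and every function name and example WoE value below is hypothetical; none of this is the authors' implementation.

```python
import math

# Figures quoted in the abstract.
N_SCEPGS = 996
H1_TESTS_PER_SCEPG = 1      # true-contributor test
H2_TESTS_PER_SCEPG = 201    # false-contributor tests

def total_outcomes() -> int:
    """Number of test outcomes per model: scEPGs x (H1 + H2 tests)."""
    return N_SCEPGS * (H1_TESTS_PER_SCEPG + H2_TESTS_PER_SCEPG)

def woe(likelihood_ratio: float) -> float:
    """Weight of evidence, ASSUMED here to be log10 of the likelihood ratio."""
    return math.log10(likelihood_ratio)

def woe_range(woes_by_model: dict) -> float:
    """Spread (max - min) of the WoEs that the models assign to one scEPG."""
    values = list(woes_by_model.values())
    return max(values) - min(values)

def needs_closer_look(woes_by_model: dict,
                      range_threshold: float = 3.0,
                      woe_ceiling: float = 9.0) -> bool:
    """Flag an scEPG whose WoE range is >= 3 while at least one model
    reports WoE <= 9, mirroring the selection rule in the abstract."""
    lowest = min(woes_by_model.values())
    return woe_range(woes_by_model) >= range_threshold and lowest <= woe_ceiling

if __name__ == "__main__":
    print(total_outcomes())
    # Hypothetical WoEs for one scEPG; not taken from the paper.
    example = {"EESCIt": 8.0, "TD": 4.5, "DCM": 7.9}
    print(woe_range(example), needs_closer_look(example))
```

The ceiling on WoE keeps the focus on cases where a disagreement between models could change how the evidence is read; a three-unit spread between WoEs that are all far above 9 matters much less in practice.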
Pages: 13