The (in)dependence of single-cell data inferences on model constructs

Authors
Grgicak, Catherine M. [1, 2]
Slooten, Klaas [3, 4]
Cowell, Robert G. [5]
Bhembe, Qhawe [2]
Lun, Desmond S. [2, 6]
Affiliations
[1] Rutgers State Univ, Dept Chem, Program Forens Sci, Camden, NJ 08102 USA
[2] Rutgers State Univ, Ctr Computat & Integrat Biol, Camden, NJ 08102 USA
[3] Netherlands Forens Inst, POB 24044, NL-2490 AA The Hague, Netherlands
[4] Vrije Univ Amsterdam, De Boelelaan 1081, NL-1081 HV Amsterdam, Netherlands
[5] City Univ London, London, England
[6] Rutgers State Univ, Dept Comp Sci, 315 Penn St R306A, Camden, NJ 08102 USA
Keywords
Forensic DNA; Single-cell forensics; Single-cell genetics; Single-cell inference; Likelihood ratio; Probabilistic genotyping; EESCIt; TD; DCM; LR calibration; DEVELOPMENTAL VALIDATION; DNA MIXTURES; LOW-TEMPLATE; MULTIPLEX; PROPOSITIONS; SYSTEM; NUMBER; LEVEL;
DOI
10.1016/j.fsigen.2024.103220
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Recent developments in single-cell analysis have revolutionized basic research and have garnered the attention of the forensic domain. Though single-cell analysis is not new to forensics, the ways in which these data can be generated and interpreted are. Modern interpretation strategies report likelihood ratios that rely on a model of the world that is a simplification of it. It is, therefore, plausible that different reasonable models will assign noticeably different weights of evidence (WoEs) to some of these data, resulting in inconsistent reports and protracted reviews of that evidence, potentially across years. With one goal of research being to identify and understand sources of inconsistency at an early stage, we undertake a study that evaluates WoE at the limit of one single-cell electropherogram (scEPG) across three architecturally distinct probabilistic models: EESCIt (Evidentiary Evaluation of Single Cells), TD (Top-Down), and DCM (Discrete Cell Model). To do this, we performance-test the three models on a set of 996 individual scEPGs and conduct one H1-true (true contributor) test and 201 H2-true (false contributor) tests per scEPG. With the 201,192 outcomes per model, we confirm that scEPGs resolve the hypotheses well, regardless of which model is applied. We also observe that WoEs increase, on average, by 1 for every 1000 RFU of total intensity added until a plateau near the logarithm of the inverse of the random match probability is reached at ca. 22,000 RFU. By querying WoE calibration for each model, we determine whether the evidence is overstated or understated by any one of them. We find that for WoE >= -1 hardly any calibration discrepancy is observed. There were rare instances, however, in which WoEs <= -1 pointed too strongly in the negative direction even though H1 was true.
This was the result of five scEPGs that not only exhibited extreme signal in stutter positions, but also carried little information in other loci. These findings show that all three models appropriately stated WoEs for scEPGs when reporting positive WoE, and the two continuous models' WoEs reasonably represented the findings when WoE < -1 for most loci. To explore further, we continued with paired analyses that evaluated the agreement in WoE, per scEPG, across models. Unlike unpaired analyses, this evaluation determines whether well-performing models return equivalent results for the same scEPG. The paired analysis was summarized by way of intraclass correlations, which were at least 0.99997. Further, we found that 762 of 996 WoEs were within a range of 3 orders of magnitude of each other, though many of these were associated with WoEs that were large, i.e., > 9, in the first instance. When we focus more closely on scEPGs giving ranges >= 3, but whose WoE is <= 9 for at least one of the models, we find 21 of them. When we perform a locus-by-locus investigation of these 21 and of the five scEPGs returning overly strong negative WoEs for true contributors, we find that extreme stutter is usually the cause of the challenges. To ameliorate differences in predicting rare, though impactful, events, we proffer interpretive adaptations that extend beyond manually addressing the phenomena. With the WoEs calibrated within their relevant regions across EESCIt, TD and DCM, we categorize each model as meeting the pillar of legitimacy for single-cell data within its intended WoE range.
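The test count and the paired-analysis screen described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that WoE is the base-10 logarithm of the likelihood ratio (suggested by the abstract's "orders of magnitude" wording but not stated outright), and the function names and the example WoE values are hypothetical, chosen only for illustration.

```python
import math

# Figures taken from the abstract.
N_SCEPGS = 996            # individual scEPGs in the test set
H1_TESTS_PER_SCEPG = 1    # true-contributor tests per scEPG
H2_TESTS_PER_SCEPG = 201  # false-contributor tests per scEPG


def outcomes_per_model(n_scepgs, h1_tests, h2_tests):
    """Total number of test outcomes produced by one model."""
    return n_scepgs * (h1_tests + h2_tests)


def woe_from_lr(lr):
    """Assumption: WoE is log10 of the likelihood ratio."""
    return math.log10(lr)


def woe_range(woes):
    """Spread (max minus min) of the WoEs reported for one scEPG."""
    return max(woes) - min(woes)


def is_discordant(woes_by_model, range_cutoff=3.0, woe_cutoff=9.0):
    """Flag an scEPG whose WoEs span at least `range_cutoff` while at
    least one model reports a WoE at or below `woe_cutoff`."""
    values = list(woes_by_model.values())
    return woe_range(values) >= range_cutoff and min(values) <= woe_cutoff


# 996 * (1 + 201) = 201192, matching the abstract's outcomes per model.
print(outcomes_per_model(N_SCEPGS, H1_TESTS_PER_SCEPG, H2_TESTS_PER_SCEPG))

# Hypothetical WoEs for two scEPGs (illustrative values, not study data).
example_a = {"EESCIt": 12.1, "TD": 8.4, "DCM": 11.0}
example_b = {"EESCIt": 20.5, "TD": 16.9, "DCM": 19.0}
print(is_discordant(example_a), is_discordant(example_b))
```

In this sketch the first hypothetical scEPG is flagged because its range exceeds 3 and one model falls at or below 9, while the second is not flagged because all three WoEs are large even though they differ by more than 3.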
Pages: 13