Model-based clustering for flow and mass cytometry data with clinical information

Cited by: 5
Authors
Abe, Ko [1]
Minoura, Kodai [1,2]
Maeda, Yuka [3,4,5]
Nishikawa, Hiroyoshi [2,3,4,5]
Shimamura, Teppei [1]
Affiliations
[1] Nagoya Univ, Grad Sch Med, Div Syst Biol, Showa Ku, 65 Tsurumai Cho, Nagoya, Aichi 4668550, Japan
[2] Nagoya Univ, Grad Sch Med, Div Immunol, Showa Ku, 65 Tsurumai Cho, Nagoya, Aichi 4668550, Japan
[3] Natl Canc Ctr, EPOC, Res Inst, Div Canc Immunol, Chuo Ku, Tsukiji 5-1-1, Kashiwa, Chiba, Japan
[4] Kashiwanoha 6-5-1, Tokyo 1040045, Japan
[5] Kashiwanoha 6-5-1, Chiba 2778577, Japan
Keywords
Flow cytometry; Mass cytometry; Bayesian mixture model; Stochastic EM algorithm; DIAGNOSIS; CELLS
DOI
10.1186/s12859-020-03671-7
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: High-dimensional flow cytometry and mass cytometry allow systemic-level characterization of more than 10 protein profiles at single-cell resolution and provide a much broader landscape in many biological applications, such as disease diagnosis and prediction of clinical outcome. When associating clinical information with cytometry data, traditional approaches require two distinct steps: identification of cell populations, followed by a statistical test to determine whether the difference between two population proportions is significant. These two-step approaches can lead to information loss and analysis bias.

Results: We propose a novel statistical framework, called LAMBDA (Latent Allocation Model with Bayesian Data Analysis), for simultaneous identification of unknown cell populations and discovery of associations between these populations and clinical information. LAMBDA uses separate probabilistic models designed for the distributional characteristics of flow and mass cytometry data, respectively. For mass cytometry data, we use a zero-inflated distribution based on the characteristics of the data. A simulation study confirms the usefulness of this model by evaluating the accuracy of the estimated parameters. We also demonstrate that LAMBDA can identify associations between cell populations and their clinical outcomes by analyzing real data. LAMBDA is implemented in R and is available from GitHub (https://github.com/abikoushi/lambda).
Pages: 15