Medoidshift clustering applied to genomic bulk tumor data

Cited by: 3
Authors
Roman, Theodore [1, 2]
Xie, Lu [1, 2]
Schwartz, Russell [1, 3]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Computat Biol Dept, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[2] Joint Carnegie Mellon Univ Pittsburgh, PhD Program Computat Biol, 5000 Forbes Ave, Pittsburgh, PA 15213 USA
[3] Carnegie Mellon Univ, Mellon Coll Sci, Dept Biol Sci, 4400 Fifth Ave, Pittsburgh, PA 15213 USA
Source
BMC GENOMICS | 2016 / Vol. 17
Keywords
Computational biology; Clustering; Tumor; Heterogeneity; INTRATUMOR HETEROGENEITY; SEQUENCING REVEALS; GENETIC-ANALYSIS; WHOLE-GENOME; IN-SITU; CELL; EVOLUTION; EXPRESSION; MUTATIONS; CARCINOMA
DOI
10.1186/s12864-015-2302-x
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Despite the enormous medical impact of cancers and intensive study of their biology, detailed characterization of tumor growth and development remains elusive. This difficulty occurs in large part because of enormous heterogeneity in the molecular mechanisms of cancer progression, both tumor-to-tumor and cell-to-cell in single tumors. Advances in genomic technologies, especially at the single-cell level, are improving the situation, but these approaches are held back by limitations of the biotechnologies for gathering genomic data from heterogeneous cell populations and the computational methods for making sense of those data. One popular way to gain the advantages of whole-genome methods without the cost of single-cell genomics has been the use of computational deconvolution (unmixing) methods to reconstruct clonal heterogeneity from bulk genomic data. These methods, too, are limited by the difficulty of inferring genomic profiles of rare or subtly varying clonal subpopulations from bulk data, a problem that can be computationally reduced to that of reconstructing the geometry of point clouds of tumor samples in a genome space. Here, we present a new method to improve that reconstruction by better identifying subspaces corresponding to tumors produced from mixtures of distinct combinations of clonal subpopulations. We develop a nonparametric clustering method based on medoidshift clustering for identifying subgroups of tumors expected to correspond to distinct trajectories of evolutionary progression. We show on synthetic and real tumor copy-number data that this new method substantially improves our ability to resolve discrete tumor subgroups, a key step in the process of accurately deconvolving tumor genomic data and inferring clonal heterogeneity from bulk data.
Pages: 11
Related papers
50 items in total
  • [21] Gclust: A Parallel Clustering Tool for Microbial Genomic Data
    Li, Ruilin
    He, Xiaoyu
    Dai, Chuangchuang
    Zhu, Haidong
    Lang, Xianyu
    Chen, Wei
    Li, Xiaodong
    Zhao, Dan
    Zhang, Yu
    Han, Xinyin
    Niu, Tie
    Zhao, Yi
    Cao, Rongqiang
    He, Rong
    Lu, Zhonghua
    Chi, Xuebin
    Li, Weizhong
    Niu, Beifang
    GENOMICS PROTEOMICS & BIOINFORMATICS, 2019, 17 (05) : 496 - 502
  • [22] ENHANCED STREAMING BASED SUBSPACE CLUSTERING APPLIED TO ACOUSTIC SCENE DATA CLUSTERING
    Li, Shuoyang
    Gu, Yuantao
    Luo, Yuhui
    Chambers, Jonathon
    Wang, Wenwu
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 11 - 15
  • [23] A Genomic Selection Index Applied to Simulated and Real Data
    Jesus Ceron-Rojas, J.
    Crossa, Jose
    Arief, Vivi N.
    Basford, Kaye
    Rutkoski, Jessica
    Jarquin, Diego
    Alvarado, Gregorio
    Beyene, Yoseph
    Semagn, Kassa
    DeLacy, Ian
    G3-GENES GENOMES GENETICS, 2015, 5 (10) : 2155 - 2164
  • [24] Evolutionary Techniques for Hierarchical Clustering Applied to Microarray Data
    Castellanos-Garzon, Jose A.
    Miguel-Quintales, Luis A.
    2ND INTERNATIONAL WORKSHOP ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (IWPACBB 2008), 2009, 49 : 118 - 127
  • [25] Particle Swarm Optimization applied to Relational Data Clustering
    de Gusmao, Rene Pereira
    Tenorio de Carvalho, Francisco de Assis
    2016 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2016, : 1690 - 1695
  • [26] Genetic algorithms applied to clustering problem and data mining
    Jimenez, J. F.
    Cuevas, F. J.
    Carpio, J. M.
    NEW ADVANCES IN SIMULATION, MODELLING AND OPTIMIZATION (SMO '07), 2007, : 219 - +
  • [27] Clustering Methods Applied for Gene Expression Data: A Study
    Gupta, Shelly
    Singh, Shailender Narayan
    Kumar, Dharminder
    2016 SECOND INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE & COMMUNICATION TECHNOLOGY (CICT), 2016, : 724 - 728
  • [28] Joint co-clustering: Co-clustering of genomic and clinical bioimaging data
    Ficarra, Elisa
    De Micheli, Giovanni
    Yoon, Sungroh
    Benini, Luca
    Macii, Enrico
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2008, 55 (05) : 938 - 949
  • [29] Subset clustering of binary sequences, with an application to genomic abnormality data
    Hoff, PD
    BIOMETRICS, 2005, 61 (04) : 1027 - 1036
  • [30] An Ultra-Fast Method for Clustering of Big Genomic Data
    Kenidra, Billel
    Benmohammed, Mohamed
    INTERNATIONAL JOURNAL OF APPLIED METAHEURISTIC COMPUTING, 2020, 11 (01) : 45 - 60