stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets

Authors
Obulkasim, Askar [1]
van de Wiel, Mark A. [2]
Affiliations
[1] Vrije Univ Amsterdam, Med Ctr, Dept Epidemiol & Biostat, Amsterdam, Netherlands
[2] Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands
Keywords
classification; data integration; high-dimensional data; R package
DOI
10.4137/CIN.S13075
Chinese Library Classification number
R73 [Oncology]
Discipline classification code
100214
Abstract
This paper presents the R/Bioconductor package stepwiseCM, which classifies cancer samples efficiently using two heterogeneous data sets. The algorithm captures the distinct classification power of the two data types without actually combining them. The package is suited to classification problems in which two different types of data are available on the same samples: one type has measurements on all samples, and the other has measurements on only some of them. The former is easy to collect and/or relatively cheap (e.g., clinical covariates) compared with the latter (high-dimensional data, e.g., gene expression). An additional application for which stepwiseCM has proven useful is the combination of two high-dimensional data types, e.g., DNA copy number and mRNA expression. The package includes functions that project the neighborhood information in one data space onto the other to identify the group of samples likely to benefit most from measuring the second type of covariates. The two heterogeneous data spaces are connected by indirect mapping. The crucial difference between the stepwise classification strategy implemented in this package and existing packages is that our approach aims to be cost-efficient: it avoids measuring additional covariates, which might be expensive or patient-unfriendly, for a potentially large subgroup of individuals. Moreover, for these individuals, diagnostic test results would be available quickly, which may reduce waiting times and hence lower patients' distress. The improvements described remedy the key limitations of existing packages and facilitate the use of the stepwiseCM package in diverse applications.
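The stepwise idea in the abstract can be illustrated with a small sketch. It is written in Python rather than R and does not reproduce the stepwiseCM API or its exact algorithm. It assumes, purely for illustration, that each training sample has a point in the cheap (e.g., clinical) data space plus two 0/1 flags recording whether a cheap-data classifier and a costly-data classifier predicted it correctly. A new sample is recommended for the second measurement only if, among its nearest training neighbors in the cheap space, the costly classifier was more accurate. All function names, the toy data, and the threshold rule are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_neighbors(point, train_points, k):
    """Indices of the k training points closest to `point` in the cheap data space."""
    order = sorted(range(len(train_points)), key=lambda i: dist(point, train_points[i]))
    return order[:k]


def benefit_score(point, train_points, correct_cheap, correct_costly, k=3):
    """Local accuracy of the costly classifier minus that of the cheap one.

    Hypothetical score: a positive value suggests that samples resembling
    `point` were classified better with the second data type.
    """
    idx = nearest_neighbors(point, train_points, k)
    acc_cheap = sum(correct_cheap[i] for i in idx) / len(idx)
    acc_costly = sum(correct_costly[i] for i in idx) / len(idx)
    return acc_costly - acc_cheap


def select_for_second_stage(new_points, train_points, correct_cheap,
                            correct_costly, k=3, threshold=0.0):
    """Indices of new samples whose score exceeds `threshold` (assumed rule)."""
    return [
        j for j, p in enumerate(new_points)
        if benefit_score(p, train_points, correct_cheap, correct_costly, k) > threshold
    ]


# Toy data (invented): two clusters in a 2-D cheap data space.
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
correct_cheap = [1, 1, 1, 0, 0, 1]   # cheap classifier right/wrong per sample
correct_costly = [1, 0, 1, 1, 1, 1]  # costly classifier right/wrong per sample
new_points = [(0.1, 0.1), (5.0, 5.1)]

selected = select_for_second_stage(new_points, train_points,
                                   correct_cheap, correct_costly, k=3)
```

In this toy setting only the second new sample falls in a neighborhood where the costly data helped, so only that sample would be passed on for the additional measurement; the first keeps its cheap-data classification.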
Pages: 1-11
Number of pages: 11