Utilization of virtual samples to facilitate cancer identification for DNA microarray data in the early stages of an investigation

Cited by: 22
Authors
Li, Der-Chiang [2]
Fang, Yao-Hwei [3]
Lai, Yung-Yao [2]
Hu, Susan C. [1]
Affiliations
[1] Natl Cheng Kung Univ, Dept Publ Hlth, Coll Med, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ, Dept Ind & Informat Management, Tainan 701, Taiwan
[3] Natl Hlth Res Inst, Div Biostat & Bioinformat, Zhunan 350, Miaoli County, Taiwan
Keywords
Classification; DNA microarray; Gene selection; Small-sample problem; Virtual Sample Generation; CLASSIFICATION METHODS; KNOWLEDGE; SELECTION; GENES; TUMOR;
DOI
10.1016/j.ins.2009.04.003
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
DNA microarray datasets are generally small, high-dimensional with many non-discriminative genes, and non-linear with outliers. Their size/dimension ratio makes learning from DNA microarray datasets a small-sample problem. Recently, researchers have developed various gene selection algorithms to discover the genes most relevant to a specific disease and thus reduce computation. Most gene selection algorithms improve learning performance and efficiency, but still suffer from insufficient training samples in the datasets. Moreover, in the early stage of diagnosing a new disease, very limited data can be obtained, so the derived diagnostic model is usually unreliable for identifying the new disease. Consequently, diagnostic performance cannot always be robust, even with gene selection algorithms. To address very limited training datasets with non-linear data or outliers, we propose GVSG (Group Virtual Sample Generation), a non-linear Virtual Sample Generation algorithm. This method detects the characteristics of the very limited data, forms discrete groups for each discriminative gene, and systematically generates virtual samples for each of these groups to accelerate and stabilize the modeling process. The results show that this method significantly improves learning accuracy when only early-stage DNA microarray data are available. (c) 2009 Elsevier Inc. All rights reserved.
Pages: 2740 - 2753
Page count: 14
Related papers
50 in total
  • [31] A muti-SVMs design for cancer diagnosis using DNA microarray data
    Yang, Jinglin
    Xu, Yongli
    Li, Hanxiong
    2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 2241 - +
  • [32] A Systems Approach to Gene Ranking from DNA Microarray Data of Cervical Cancer
    Emmert-Streib, Frank
    Dehmer, Matthias
    Liu, Jing
    Muehlhaeuser, Max
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 8, 2005, 8 : 82 - 87
  • [33] Identification of prognostic signatures in breast cancer microarray data using Bayesian techniques
    Carrivick, L.
    Rogers, S.
    Clark, J.
    Campbell, C.
    Girolami, M.
    Cooper, C.
    JOURNAL OF THE ROYAL SOCIETY INTERFACE, 2006, 3 (08) : 367 - 381
  • [34] Identification of adriamycin resistance genes in breast cancer based on microarray data analysis
    Chen, Yan
    Lin, Yingfeng
    Cui, Zhaolei
    TRANSLATIONAL CANCER RESEARCH, 2020, 9 (12) : 7486 - 7494
  • [35] Identification of molecular characteristics induced by radiotherapy in rectal cancer based on microarray data
    Ge, Chang
    Wu, Mengxia
    Chen, Guifang
    Yu, Guanying
    Ji, Dehui
    Wang, Shaozhao
    ONCOLOGY LETTERS, 2017, 13 (04) : 2777 - 2783
  • [36] Prediction of Early Breast Cancer Metastasis from DNA Microarray Data Using High-Dimensional Cox Regression Models
    Zemmour, Christophe
    Bertucci, Francois
    Finetti, Pascal
    Chetrit, Bernard
    Birnbaum, Daniel
    Filleron, Thomas
    Boher, Jean-Marie
    CANCER INFORMATICS, 2015, 14 : 129 - 138
  • [37] RETRACTED ARTICLE: Identification of featured biomarkers in different types of lung cancer with DNA microarray
    Chao Zhou
    Hao Chen
    Li Han
    An Wang
    Liang-an Chen
    Molecular Biology Reports, 2014, 41 : 6357 - 6363
  • [38] Retraction Note to: Identification of featured biomarkers in different types of lung cancer with DNA microarray
    Chao Zhou
    Hao Chen
    Li Han
    An Wang
    Liang-an Chen
    Molecular Biology Reports, 2015, 42 : 1481 - 1481
  • [39] DNA Microarray Expression Profiling of Bladder Cancer Allows Identification of Noninvasive Diagnostic Markers
    Mengual, Lourdes
    Burset, Moises
    Ars, Elisabet
    Jose Lozano, Juan
    Villavicencio, Humberto
    Jose Ribal, Maria
    Alcaraz, Antonio
    JOURNAL OF UROLOGY, 2009, 182 (02) : 741 - 748
  • [40] Comparison of Nanostring nCounter® Data on FFPE Colon Cancer Samples and Affymetrix Microarray Data on Matched Frozen Tissues
    Chen, Xi
    Deane, Natasha G.
    Lewis, Keeli B.
    Li, Jiang
    Zhu, Jing
    Washington, M. Kay
    Beauchamp, R. Daniel
    PLOS ONE, 2016, 11 (05):