Classification of miRNA Expression Data Using Random Forests for Cancer Diagnosis

Cited by: 7
Authors
Razak, Eliza [1]
Yusof, Faridah [1]
Raus, Raha Ahmad [1]
Affiliations
[1] Int Islamic Univ Malaysia, Kuala Lumpur, Malaysia
Keywords
miRNA; cancer; random forest; classification; MICRORNA; BIOMARKERS;
DOI
10.1109/ICCCE.2016.49
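Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code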
0808 ; 0809 ;
Abstract
Cancer is a leading cause of death, responsible for around 13% of all deaths worldwide. Cancer incidence is growing at an alarming rate in Malaysia and globally; an estimated one in four Malaysians will develop cancer by the age of 75. Conventional cancer diagnosis relies on skilled physicians, aided by medical imaging, to detect symptoms that usually appear only in the late stages of the disease. Furthermore, biopsy examinations are highly invasive because tissue samples must be extracted from patients. Minimally invasive cancer biomarkers exist in the form of serum proteins. However, existing protein-based diagnostic techniques require labor-intensive analysis and have low sensitivity. A number of studies have sought to identify novel miRNA-based cancer biomarkers, but existing miRNA-based diagnostic techniques suffer from low accuracy, sensitivity, and specificity. The low accuracy and sensitivity stem from the extremely low miRNA count in body fluids. Cross-contamination between cells and exosomes during sample preparation is also unavoidable. This paper proposes to circumvent these problems at the data analysis stage with a machine learning technique called Random Forest. The proposed system achieved 93.48% accuracy for gastric cancer and 100% accuracy for ovarian cancer. These results are promising: despite the noise introduced during sample preparation and the low miRNA count in body fluids, the proposed system was able to identify miRNA markers responsible for the classification of cancer.
Pages: 187-190
Page count: 4