Machine Learning Clustering for Cancer Analysis Employing Gene Expression Data

Cited by: 0
Authors
Ospino, Camilo Andres Perez [1]
Rivera, Jorman Arbey Castro [1]
Orjuela-Canon, Alvaro D. [2]
Affiliations
[1] Univ Rosario, Bogota, Colombia
[2] Univ Rosario, Sch Med & Hlth Sci, Bogota, Colombia
Keywords
Pan-Cancer; K-means; Data base; Genomics; Clustering;
DOI
10.1109/COLCACI59285.2023.10226026
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The idea that cancer types differ in their molecular makeup (DNA, RNA, proteins, and epigenetics) depending on the origin and location of the cancer has been studied. The Cancer Genome Atlas (TCGA) launched an initiative to build a carefully curated database that ensures quality data in the profiling of different tumors and thereby promotes research. One part of this large database, called Pan-Cancer, contains the genomic, epigenetic, transcriptional, and proteomic profiling of 12 different types of cancer. In this research we took one of these profiles, RNA profiling, for 5 cancer types in order to determine whether the samples can be segmented in an unsupervised manner and to evaluate how the types differ by origin. The results indicate that the number of clusters can vary from 5 to 7. Five clusters correspond to the database labels, whereas the division into 6 or 7 clusters arises from the clustering of breast cancer (BRCA), which has several origins.
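As a rough illustration of the unsupervised segmentation the abstract describes, the sketch below runs a plain K-means (Lloyd's algorithm) for 5, 6, and 7 clusters and reports the within-cluster sum of squares for each. It is not the authors' pipeline: the record names K-means only as a keyword, and the toy "expression" matrix, the function names `make_toy_expression` and `kmeans`, and all parameter values are assumptions made for this example rather than TCGA data or settings from the paper.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_toy_expression(n_types=5, per_type=30, n_genes=8, seed=1):
    """Hypothetical stand-in for an expression matrix: one Gaussian blob per cancer type."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_types):
        center = [rng.uniform(0, 20) for _ in range(n_genes)]
        for _ in range(per_type):
            samples.append([rng.gauss(mu, 1.0) for mu in center])
    return samples


def kmeans(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm with random initial centroids drawn from the data."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for step in range(n_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        converged = step > 0 and new_labels == labels
        labels = new_labels
        if converged:
            break
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


samples = make_toy_expression()
results = {}
for k in (5, 6, 7):
    labels, centroids, inertia = kmeans(samples, k)
    results[k] = (labels, inertia)
    print(f"k={k}: within-cluster sum of squares = {inertia:.1f}")
```

On real expression data, comparing such scores across k (or comparing cluster assignments against the database labels) is one way to see whether a single labeled type, such as BRCA in the abstract, splits into more than one cluster.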
Pages: 6
Related papers
50 in total
  • [32] MINER: exploratory analysis of gene interaction networks by machine learning from expression data
    Kadupitige, Sidath Randeni
    Leung, Kin Chun
    Sellmeier, Julia
    Sivieng, Jane
    Catchpoole, Daniel R.
    Bain, Michael E.
    Gaeta, Bruno A.
    BMC GENOMICS, 2009, 10
  • [33] Machine Learning for Childhood Acute Lymphoblastic Leukaemia Gene Expression Data Analysis: A Review
    Chaiboonchoe, Amphun
    Samarasinghe, Sandhya
    Kulasiri, Don
    CURRENT BIOINFORMATICS, 2010, 5 (02) : 118 - 133
  • [34] Coupled two-way clustering analysis of breast cancer and colon cancer gene expression data
    Getz, G
    Gal, H
    Kela, I
    Notterman, DA
    Domany, E
    BIOINFORMATICS, 2003, 19 (09) : 1079 - 1089
  • [35] Integrating Gene Ontology Based Grouping and Ranking into the Machine Learning Algorithm for Gene Expression Data Analysis
    Yousef, Malik
    Sayici, Ahmet
    Bakir-Gungor, Burcu
    DATABASE AND EXPERT SYSTEMS APPLICATIONS - DEXA 2021 WORKSHOPS, 2021, 1479 : 205 - 214
  • [36] Machine Learning and Personalized Modeling Based Gene Selection for Acute GvHD Gene Expression Data Analysis
    Fiasche, Maurizio
    Cuzzola, Maria
    Fedele, Roberta
    Iacopino, Pasquale
    Morabito, Francesco C.
    ARTIFICIAL NEURAL NETWORKS-ICANN 2010, PT I, 2010, 6352 : 217 - +
  • [37] Clustering analysis of microarray gene expression data with new clustering ensemble method
    Luo, Fei
    Liu, Juan
    PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, PROCEEDINGS, 2007, : 500 - 504
  • [38] Machine learning analysis of TCGA cancer data
    Liñares-Blanco J.
    Pazos A.
    Fernandez-Lozano C.
    PeerJ Computer Science, 2021, 7 : 1 - 47
  • [39] Prediction of colorectal cancer chemotherapy efficacy using machine learning applied to gene expression data
    Jafri, Mohsin Saleet
    Amniouel, Soukaina
    JOURNAL OF CLINICAL ONCOLOGY, 2024, 42 (16)
  • [40] Breast cancer prediction based on gene expression data using interpretable machine learning techniques
    Kallah-Dagadu, Gabriel
    Mohammed, Mohanad
    Nasejje, Justine B.
    Mchunu, Nobuhle Nokubonga
    Twabi, Halima S.
    Batidzirai, Jesca Mercy
    Singini, Geoffrey Chiyuzga
    Nevhungoni, Portia
    Maposa, Innocent
    SCIENTIFIC REPORTS, 2025, 15 (01):