Performing Data Mining And Integrative Analysis Of Biomarker In Breast Cancer Using Multiple Publicly Accessible Databases

Cited by: 2
Authors
Chen, Min-na [1]
Zeng, De [2]
Zheng, Zhuo-qun [3]
Li, Zheng [3]
Wu, Jian-le [3]
Jin, Jun-yu [3]
Wang, He-jia [3]
Huang, Cui-zhen [1]
Lin, Hao-yu [1]
Affiliations
[1] Shantou Univ, Med Coll, Affiliated Hosp 1, Dept Thyroid & Breast Surg, Shantou, Peoples R China
[2] Shantou Univ, Med Coll, Canc Hosp, Dept Med Oncol, Shantou, Peoples R China
[3] Shantou Univ, Med Coll, Shantou, Peoples R China
Source
JOVE-JOURNAL OF VISUALIZED EXPERIMENTS | 2019 / Issue 147
Funding
National Natural Science Foundation of China;
Keywords
Cancer Research; Issue 147; Breast cancer; Biomarker; Database; Data mining; Prognosis; Bioinformation; HUMAN PROTEIN ATLAS; EXPRESSION; FAMILY; RESOURCE
DOI
10.3791/59238
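Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general];
Discipline classification codes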
07 ; 0710 ; 09 ;
Abstract
In recent years, new databases have been designed to lower the barriers to working with intricate cancer genomic datasets, thereby helping investigators analyze and interpret genes, samples, and clinical data across different types of cancer. Herein, we describe a practical operating procedure, taking ID1 (Inhibitor of DNA binding proteins 1) as an example, to characterize the expression patterns of biomarkers and survival predictors of breast cancer based on pooled clinical datasets derived from publicly accessible online databases, including ONCOMINE, bcGenExMiner v4.0 (Breast cancer gene-expression miner v4.0), GOBO (Gene expression-based Outcome for Breast cancer Online), HPA (The Human Protein Atlas), and Kaplan-Meier plotter. The analysis began by querying the expression pattern of the gene of interest (e.g., ID1) in cancerous samples vs. normal samples. Then, the correlation between ID1 and clinicopathological characteristics in breast cancer was analyzed. Next, the expression profiles of ID1 were stratified according to different subgroups. Finally, the association between ID1 expression and survival outcome was analyzed. The procedure simplifies the integration of multidimensional data types at the gene level from different databases and the testing of hypotheses regarding recurrence and the genomic context of gene alteration events in breast cancer. This method can improve the credibility and representativeness of the conclusions, thereby presenting an informative perspective on a gene of interest.
Pages: 11
Related papers
50 in total
  • [1] Analysis of breast cancer using data mining & statistical techniques
    Xiong, XC
    Kim, YO
    Baek, YC
    Rhee, DW
    Kim, SH
    Sixth International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing and First AICS International Workshop on Self-Assembling Wireless Networks, Proceedings, 2005, : 82 - 87
  • [2] A Survey on Breast Cancer Analysis Using Data Mining Techniques
    Padmapriya, B.
    Velmurugan, T.
    2014 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (IEEE ICCIC), 2014, : 1234 - 1237
  • [3] Integrative transcriptome data mining for identification of core lncRNAs in breast cancer
    Zhang, Xiaoming
    Zhuang, Jing
    Liu, Lijuan
    He, Zhengguo
    Liu, Cun
    Ma, Xiaoran
    Li, Jie
    Ding, Xia
    Sun, Changgang
    PEERJ, 2019, 7
  • [4] ROCK: a resource for integrative breast cancer data analysis
    Saif Ur-Rehman
    Qiong Gao
    Costas Mitsopoulos
    Marketa Zvelebil
    Breast Cancer Research and Treatment, 2013, 139 : 907 - 921
  • [5] ROCK: a resource for integrative breast cancer data analysis
    Ur-Rehman, Saif
    Gao, Qiong
    Mitsopoulos, Costas
    Zvelebil, Marketa
    BREAST CANCER RESEARCH AND TREATMENT, 2013, 139 (03) : 907 - 921
  • [6] Integrative Analysis of Prognosis Data on Multiple Cancer Subtypes
    Liu, Jin
    Huang, Jian
    Zhang, Yawei
    Lan, Qing
    Rothman, Nathaniel
    Zheng, Tongzhang
    Ma, Shuangge
    BIOMETRICS, 2014, 70 (03) : 480 - 488
  • [7] An Analysis of the Survivability in SEER Breast Cancer Data Using Association Rule Mining
    Li, Fangfang
    Duan, Yu
    SECURITY, PRIVACY AND ANONYMITY IN COMPUTATION, COMMUNICATION AND STORAGE, (SPACCS 2016), 2016, 0067 : 184 - 194
  • [8] Integrative Analysis of Cancer Prognosis Data With Multiple Subtypes Using Regularized Gradient Descent
    Ma, Shuangge
    Zhang, Yawei
    Huang, Jian
    Huang, Yuan
    Lan, Qing
    Rothman, Nathaniel
    Zheng, Tongzhang
    GENETIC EPIDEMIOLOGY, 2012, 36 (08) : 829 - 838
  • [9] A Comparative Analysis of Data Mining Techniques on Breast Cancer Diagnosis Data using WEKA Toolbox
    Alshammari, Majdah
    Mezher, Mohammad
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (08) : 224 - 229
  • [10] Data mining from multiple heterogeneous relational databases using decision tree classification
    Mehenni, Tahar
    Moussaoui, Abdelouahab
    PATTERN RECOGNITION LETTERS, 2012, 33 (13) : 1768 - 1775