Multi-view based integrative analysis of gene expression data for identifying biomarkers

Cited by: 9
Authors
Yang, Zi-Yi [1, 2]
Liu, Xiao-Ying [3]
Shu, Jun [4]
Zhang, Hui [1, 2]
Ren, Yan-Qiong [1, 2]
Xu, Zong-Ben [4]
Liang, Yong [1, 2]
Affiliations
[1] Macau Univ Sci & Technol, Fac Informat Technol, Taipa 999078, Macao, Peoples R China
[2] Macau Univ Sci & Technol, State Key Lab Qual Res Chinese Med, Taipa 999078, Macao, Peoples R China
[3] Guangdong Polytech Sci & Technol, Comp Engn Tech Coll, Zhuhai 519090, Peoples R China
[4] Xi An Jiao Tong Univ, Sch Math & Stat, Minist Educ, Key Lab Intelligent Networks & Network Secur, Xian 710049, Shaanxi, Peoples R China
Keywords
VARIABLE SELECTION; MICROARRAY; METAANALYSIS; REGULARIZATION; PROFILES;
DOI
10.1038/s41598-019-49967-4
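Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];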
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
The widespread application of microarray technology has produced a vast quantity of publicly available gene expression datasets. However, analysis of gene expression data using biostatistics and machine learning approaches is a challenging task due to (1) high noise; (2) small sample size with high dimensionality; (3) batch effects; and (4) low reproducibility of significant biomarkers. These issues reveal the complexity of gene expression data and significantly obstruct the use of microarray technology in clinical applications. Integrative analysis offers an opportunity to address these issues and provides a more comprehensive understanding of biological systems, but current methods have several limitations. This work leverages state-of-the-art machine learning developments for the integration of multiple gene expression datasets, classification, and identification of significant biomarkers. We design a novel integrative framework, MVIAm (Multi-View based Integrative Analysis of microarray data for identifying biomarkers). It applies multiple cross-platform normalization methods to aggregate multiple datasets into a multi-view dataset and utilizes a robust learning mechanism, Multi-View Self-Paced Learning (MVSPL), for gene selection in cancer classification problems. We demonstrate the capabilities of MVIAm using simulated data and studies of breast cancer and lung cancer. It can be applied flexibly and is an effective tool for addressing the four challenges of gene expression data analysis. Our proposed model makes microarray integrative analysis more systematic and expands its range of applications.
Pages: 15
Related papers
50 in total
  • [1] Multi-view based integrative analysis of gene expression data for identifying biomarkers
    Zi-Yi Yang
    Xiao-Ying Liu
    Jun Shu
    Hui Zhang
    Yan-Qiong Ren
    Zong-Ben Xu
    Yong Liang
    [J]. Scientific Reports, 9
  • [2] Robust integrative biclustering for multi-view data
    Zhang, Weijie
    Wendt, Christine
    Bowler, Russel
    Hersh, Craig P.
    Safo, Sandra E.
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2022, 31 (11) : 2201 - 2216
  • [3] Structural learning and integrative decomposition of multi-view data
    Gaynanova, Irina
    Li, Gen
    [J]. BIOMETRICS, 2019, 75 (04) : 1121 - 1132
  • [4] Identifying the potential miRNA biomarkers based on multi-view networks and reinforcement learning for diseases
    Su, Benzhe
    Wang, Weiwei
    Lin, Xiaohui
    Liu, Shenglan
    Huang, Xin
    [J]. BRIEFINGS IN BIOINFORMATICS, 2024, 25 (01)
  • [5] Multi-View Kernel-based Data Analysis
    Averbuch, Amir
    Salhov, Moshe
    Lindenbaum, Ofir
    Silberschatz, Avi
    Shkolnisky, Yoel
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON THE SCIENCE OF ELECTRICAL ENGINEERING (ICSEE), 2016,
  • [6] Multi-view feature selection for identifying gene markers: a diversified biological data driven approach
    Sudipta Acharya
    Laizhong Cui
    Yi Pan
    [J]. BMC Bioinformatics, 21
  • [7] Multi-View Gene Clustering using Gene Ontology and Expression-based Similarities
    Giri, Swagarika Jaharlal
    Saha, Sriparna
    [J]. 2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020,
  • [8] Multi-view feature selection for identifying gene markers: a diversified biological data driven approach
    Acharya, Sudipta
    Cui, Laizhong
    Pan, Yi
    [J]. BMC BIOINFORMATICS, 2020, 21 (Suppl 18)
  • [9] Multi-view kernel consensus for data analysis
    Salhov, Moshe
    Lindenbaum, Ofir
    Aizenbud, Yariv
    Silberschatz, Avi
    Shkolnisky, Yoel
    Averbuch, Amir
    [J]. APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2020, 49 (01) : 208 - 228
  • [10] Integrative multi-platform meta-analysis of hepatocellular carcinoma gene expression profiles for identifying prognostic and diagnostic biomarkers
    Gholizadeh, Maryam
    Hadizadeh, Morteza
    Mazlooman, Seyed Reza
    Eslami, Saeid
    Raoufi, Sadegh
    Farsimadan, Marziye
    Rashidifar, Maryam
    Drozdzik, Marek
    Mehrabani, Mehrnaz
    [J]. GENES & DISEASES, 2023, 10 (04) : 1194 - 1196