Knowledge-based gene expression classification via matrix factorization

Cited by: 27
Authors
Schachtner, R. [1]
Lutter, D. [1, 2, 3]
Knollmueller, P. [1]
Tome, A. M. [4]
Theis, F. J. [1, 2]
Schmitz, G. [3]
Stetter, M. [5]
Vilda, P. Gomez [6]
Lang, E. W. [1]
Affiliations
[1] Univ Regensburg, CIML Biophys, D-93040 Regensburg, Germany
[2] GSF Munich, CMB IBI, Munich, Germany
[3] Univ Hosp Regensburg, D-93042 Regensburg, Germany
[4] Univ Aveiro, IEETA DETI, P-3810193 Aveiro, Portugal
[5] Siemens AG, Siemens Corp Technol, D-8000 Munich, Germany
[6] Univ Politecn Madrid, DATSI FI, E-18500 Madrid, Spain
DOI
10.1093/bioinformatics/btn245
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Modern machine learning methods based on matrix decomposition techniques, such as independent component analysis (ICA) or non-negative matrix factorization (NMF), provide new and efficient tools that are currently being explored for the analysis of gene expression profiles. These exploratory feature extraction techniques yield expression modes (ICA) or metagenes (NMF), and the extracted features are considered indicative of underlying regulatory processes. They can also be applied to the classification of gene expression datasets, either by grouping samples into different categories for diagnostic purposes or by grouping genes into functional categories for further investigation of related metabolic pathways and regulatory networks.

Results: In this study we focus on unsupervised matrix factorization techniques and apply ICA and sparse NMF to microarray datasets. These datasets monitor the gene expression levels of human peripheral blood cells during differentiation from monocytes to macrophages. We show that these tools are able to identify relevant signatures in the deduced component matrices and to extract informative sets of marker genes from these gene expression profiles. The methods rely on the joint discriminative power of a set of marker genes rather than on single marker genes. With these sets of marker genes, corroborated by leave-one-out or random forest cross-validation, the datasets could easily be classified into related diagnostic categories. These categories correspond to either monocytes versus macrophages or healthy versus Niemann-Pick C disease patients.
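The workflow outlined in the abstract (factorize an expression matrix into metagenes, then validate a classifier with leave-one-out cross-validation) can be illustrated with a minimal sketch. It is not the authors' method: it swaps in plain NMF with Lee-Seung multiplicative updates instead of sparse NMF or ICA, uses a nearest-centroid rule instead of marker-gene selection or random forests, and runs on synthetic data. The function names (`nmf`, `project`, `loo_accuracy`), the samples-by-genes orientation, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF, X (samples x genes) ~ W @ H, via multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps  # sample encodings
    H = rng.random((k, m)) + eps  # metagenes (one row per metagene)
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def project(X, H, n_iter=200, seed=0, eps=1e-9):
    """Encode samples on fixed metagenes H; only W is updated."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], H.shape[0])) + eps
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W


def loo_accuracy(X, y, k=2):
    """Leave-one-out: refit NMF without sample i, then classify sample i
    by the nearest class centroid in metagene space."""
    correct = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        W, H = nmf(X[train], k)
        w_test = project(X[i:i + 1], H)
        labels = np.unique(y[train])
        centroids = np.stack([W[y[train] == c].mean(axis=0) for c in labels])
        pred = labels[np.argmin(np.linalg.norm(centroids - w_test, axis=1))]
        correct += int(pred == y[i])
    return correct / len(y)


# Synthetic stand-in for a microarray: 20 samples x 60 genes, two classes,
# each class over-expressing its own block of 20 genes (hypothetical data).
rng = np.random.default_rng(42)
n_per_class, n_genes = 10, 60
X = rng.random((2 * n_per_class, n_genes)) * 0.2
X[:n_per_class, :20] += 2.0
X[n_per_class:, 40:] += 2.0
y = np.array([0] * n_per_class + [1] * n_per_class)

W, H = nmf(X, k=2)
acc = loo_accuracy(X, y, k=2)
print("metagene matrix shape:", H.shape)
print("leave-one-out accuracy:", acc)
```

Refitting the factorization inside each fold keeps the held-out sample from influencing the metagenes, which is the point of leave-one-out validation. Reading off the genes with the largest entries in each row of `H` would be one simple way to obtain a joint set of candidate marker genes.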
Pages: 1688-1697
Number of pages: 10
Related papers
50 in total
  • [1] A Matrix Factorization Classifier for Knowledge-Based Microarray Analysis
    Schachtner, R.
    Lutter, D.
    Tome, A. M.
    Schmitz, G.
    Gomez Vilda, P.
    Lang, E. W.
    2ND INTERNATIONAL WORKSHOP ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (IWPACBB 2008), 2009, 49 : 137 - +
  • [2] Matrix factorization-based improved classification of gene expression data
    Malik S.
    Bansal P.
    Recent Advances in Computer Science and Communications, 2020, 13 (05) : 858 - 863
  • [3] Gene Expression Data Classification Based on Non-negative Matrix Factorization
    Zheng, Chun-Hou
    Zhang, Ping
    Zhang, Lei
    Liu, Xin-Xin
    Han, Ju
    IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1- 6, 2009, : 194 - +
  • [4] Exploring matrix factorization techniques for classification of gene expression profiles
    Schachtner, R.
    Lutter, D.
    Tome, A. M.
    Lang, E. W.
    Gomez Vilda, P.
    2007 IEEE INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING, CONFERENCE PROCEEDINGS BOOK, 2007, : 303 - +
  • [5] On Knowledge-Based Gene Expression Data Analysis
    Arakelyan, Arsen
    Aslanyan, Levon
    Boyajyan, Anna
    2013 COMPUTER SCIENCE AND INFORMATION TECHNOLOGIES (CSIT), 2013,
  • [6] Tumor Classification Based on Non-Negative Matrix Factorization Using Gene Expression Data
    Zheng, Chun-Hou
    Ng, To-Yee
    Zhang, Lei
    Shiu, Chi-Keung
    Wang, Hong-Qiang
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2011, 10 (02) : 86 - 93
  • [7] Towards knowledge-based gene expression data mining
    Bellazzi, Riccardo
    Zupan, Blaz
    JOURNAL OF BIOMEDICAL INFORMATICS, 2007, 40 (06) : 787 - 802
  • [8] Gene selection via matrix factorization
    Wang, Fei
    Li, Tao
    PROCEEDINGS OF THE 7TH IEEE INTERNATIONAL SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, VOLS I AND II, 2007, : 1046 - +
  • [9] Non-negative Matrix and Tensor Factorization Based Classification of Clinical Microarray Gene Expression Data
    Li, Yifeng
    Ngom, Alioune
    2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2010, : 438 - 443
  • [10] Nonlinear Knowledge-Based Classification
    Mangasarian, Olvi L.
    Wild, Edward W.
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2008, 19 (10) : 1826 - 1832