Biclustering of gene expression data by non-smooth non-negative matrix factorization

Cited by: 103
Authors
Carmona-Saez, P
Pascual-Marqui, RD
Tirado, F
Carazo, JM
Pascual-Montano, A [1]
Affiliations
[1] Univ Complutense Madrid, Fac Ciencias Fis, Comp Architecture Dept, E-28040 Madrid, Spain
[2] Natl Biotechnol Ctr, BioComp Unit, Madrid 28049, Spain
[3] Univ Hosp Psychiat, KEY Inst Brain Mind Res, CH-8029 Zurich, Switzerland
DOI
10.1186/1471-2105-7-78
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: The widespread use of microarray technologies has enabled the generation and accumulation of gene expression datasets that contain expression levels of thousands of genes across tens or hundreds of different experimental conditions. One of the major challenges in the analysis of such datasets is to discover local structures composed of sets of genes that show coherent expression patterns across subsets of experimental conditions. These patterns may provide clues about the main biological processes associated with different physiological states.
Results: In this work we present a methodology able to cluster genes and conditions that are highly related in sub-portions of the data. Our approach is based on a new data mining technique, Non-smooth Non-Negative Matrix Factorization (nsNMF), which is able to identify localized patterns in large datasets. We assessed the potential of this methodology by analyzing several synthetic datasets as well as two large and heterogeneous sets of gene expression profiles. In all cases the method was able to identify localized features related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The uncovered structures showed a clear biological meaning in terms of relationships among functional annotations of genes and the phenotypes or physiological states of the associated conditions.
Conclusion: The proposed approach can be a useful tool for analyzing large and heterogeneous gene expression datasets. The method is able to identify complex relationships among genes and conditions that are difficult to identify with standard clustering algorithms.
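The abstract names nsNMF but gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the usual nsNMF model V ≈ W S H, where S = (1 - θ) I + (θ / k) 11ᵀ is a smoothing matrix and θ in [0, 1] controls how strongly sparseness is pushed into W and H (θ = 0 reduces to plain NMF). For brevity it swaps in standard Frobenius-norm multiplicative updates, which may differ from the update rules used in the paper. The function names `smoothing_matrix`, `nsnmf`, and `top_members`, the toy data, and the rule of reading a bicluster off the largest entries of one factor are all illustrative assumptions.

```python
import numpy as np


def smoothing_matrix(k, theta):
    """Return S = (1 - theta) * I + (theta / k) * ones((k, k))."""
    return (1.0 - theta) * np.eye(k) + (theta / k) * np.ones((k, k))


def nsnmf(V, k, theta=0.5, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (genes x conditions) as W @ S @ H.

    Assumption: Frobenius-norm multiplicative updates, chosen for brevity;
    this is a sketch, not the paper's exact procedure.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_conditions = V.shape
    W = rng.random((n_genes, k)) + eps
    H = rng.random((k, n_conditions)) + eps
    S = smoothing_matrix(k, theta)
    for _ in range(n_iter):
        WS = W @ S  # smoothed basis, held fixed while H is updated
        H *= (WS.T @ V) / (WS.T @ WS @ H + eps)
        SH = S @ H  # smoothed encoding, held fixed while W is updated
        W *= (V @ SH.T) / (W @ SH @ SH.T + eps)
    return W, S, H


def top_members(W, H, factor, n_genes, n_conditions):
    """Hypothetical bicluster read-out: largest entries of one factor."""
    genes = np.argsort(W[:, factor])[::-1][:n_genes]
    conditions = np.argsort(H[factor, :])[::-1][:n_conditions]
    return genes, conditions


# Toy data: low-level noise plus one planted block of 6 genes x 4 conditions.
rng = np.random.default_rng(0)
V = 0.1 * rng.random((20, 10))
V[:6, :4] += 1.0

W, S, H = nsnmf(V, k=2, theta=0.5)
genes, conditions = top_members(W, H, factor=0, n_genes=6, n_conditions=4)
```

In this sketch each factor pairs a column of W (gene weights) with a row of H (condition weights), so sorting both gives a candidate gene set and condition set at once. How many genes and conditions to keep per factor is left as a free choice here.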
Pages: 18
Related papers
50 in total
  • [1] Biclustering of gene expression data by non-smooth non-negative matrix factorization
    Pedro Carmona-Saez
    Roberto D Pascual-Marqui
    F Tirado
    Jose M Carazo
    Alberto Pascual-Montano
    [J]. BMC Bioinformatics, 7
  • [2] Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition
    Sabzalian, B.
    Abolghasemi, V
    [J]. INTERNATIONAL JOURNAL OF ENGINEERING, 2018, 31 (10): 1698 - 1707
  • [3] Gene Expression Data Classification Based on Non-negative Matrix Factorization
    Zheng, Chun-Hou
    Zhang, Ping
    Zhang, Lei
    Liu, Xin-Xin
    Han, Ju
    [J]. IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2009: 194 - +
  • [4] A Framework for Regularized Non-Negative Matrix Factorization, with Application to the Analysis of Gene Expression Data
    Taslaman, Leo
    Nilsson, Bjorn
    [J]. PLOS ONE, 2012, 7 (11):
  • [5] Hessian Regularization Based Non-negative Matrix Factorization for Gene Expression Data Clustering
    Liu, Xiao
    Shi, Jun
    Wang, Congzhi
    [J]. 2015 37TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2015: 4130 - 4133
  • [6] Non-negative Matrix Factorization for Binary Data
    Larsen, Jacob Sogaard
    Clemmensen, Line Katrine Harder
    [J]. 2015 7TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (IC3K), 2015: 555 - 563
  • [7] Tumor Classification Based on Non-Negative Matrix Factorization Using Gene Expression Data
    Zheng, Chun-Hou
    Ng, To-Yee
    Zhang, Lei
    Shiu, Chi-Keung
    Wang, Hong-Qiang
    [J]. IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2011, 10 (02) : 86 - 93
  • [8] IMPROVED NON-NEGATIVE FACTORIZATION IN THE ANALYSIS OF GENE EXPRESSION DATA
    Zhang, Jin
    Wang, Jiajun
    [J]. 2008 INTERNATIONAL CONFERENCE ON NEURAL NETWORKS AND SIGNAL PROCESSING, VOLS 1 AND 2, 2007: 163 - 167
  • [9] Gene Expression Analysis through Parallel Non-Negative Matrix Factorization
    Alejandra Serrano-Rubio, Angelica
    Morales-Luna, Guillermo B.
    Meneses-Viveros, Amilcar
    [J]. COMPUTATION, 2021, 9 (10)
  • [10] Graph Regularized Lp Smooth Non-negative Matrix Factorization for Data Representation
    Chengcai Leng
    Hai Zhang
    Guorong Cai
    Irene Cheng
    Anup Basu
    [J]. IEEE/CAA Journal of Automatica Sinica, 2019, 6 (02) : 584 - 595