A joint optimization framework integrated with biological knowledge for clustering incomplete gene expression data

Cited by: 2
Authors
Li, Dan [1 ]
Gu, Hong [1 ]
Chang, Qiaozhen [1 ]
Wang, Jia [2 ]
Qin, Pan [1 ]
Affiliations
[1] Dalian Univ Technol, Fac Elect Informat & Elect Engn, Dalian 116024, Peoples R China
[2] Dalian Med Univ, Dept Breast Surg, Hosp 2, Dalian 116023, Peoples R China
Keywords
Gene clustering; Joint optimization; Multi-objective clustering; Imputation; Gene ontology; MISSING VALUE ESTIMATION; VALIDITY MEASURE; ALGORITHM; REPRODUCIBILITY; IMPUTATION; SELECTION; SEARCH;
DOI
10.1007/s00500-022-07180-y
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering algorithms have been successfully applied to identify co-expressed gene groups from gene expression data. Missing values often occur in gene expression data, which presents a challenge for gene clustering. When partitioning incomplete gene expression data into co-expressed gene groups, missing value imputation and clustering are generally performed as two separate processes. These two-stage methods are likely to produce imputation values that are unsuitable for the clustering task and, as a result, unsatisfactory clustering performance. This paper proposes a multi-objective joint optimization framework for clustering incomplete gene expression data that addresses this problem. The proposed framework imputes the missing expression values under the guidance of clustering, and thereby achieves a synergistic improvement of imputation and clustering. In addition, gene expression similarity and gene semantic similarity extracted from the Gene Ontology are combined, in the form of a functional neighbor interval for each missing expression value, to provide reasonable constraints for the joint optimization framework. The experiments are carried out on several benchmark data sets. In terms of the average improvement rate over the data sets and different missing rates, the framework reduces the imputation error by 6.4-14.7% and increases the clustering accuracy by 4.0-10.1% compared with six popular and promising methods. Furthermore, the biological significance of the identified gene clusters is reported to evaluate the effectiveness of the proposed framework.
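The abstract does not say how the functional neighbor interval is built, so the following is only an illustrative sketch of one way the idea could be read, not the authors' method. It assumes that neighbors are ranked by a weighted mix of expression correlation and a precomputed semantic similarity matrix, and that the interval is the range of the neighbors' observed values in the same condition. The function name, the parameters `k` and `alpha`, and the scoring rule are all hypothetical.

```python
import numpy as np

def functional_neighbor_interval(X, sem_sim, gene, cond, k=3, alpha=0.5):
    """Return (low, high) bounds for the missing entry X[gene, cond].

    X       : genes x conditions array, np.nan marks missing values
    sem_sim : genes x genes semantic similarity in [0, 1]
              (assumed to be precomputed, e.g. from Gene Ontology terms)
    k       : hypothetical number of neighbors to keep
    alpha   : hypothetical weight of expression vs. semantic similarity
    """
    n_genes = X.shape[0]
    scores = np.full(n_genes, -np.inf)
    for other in range(n_genes):
        # A neighbor must be a different gene with an observed value here.
        if other == gene or np.isnan(X[other, cond]):
            continue
        # Compare the two genes only on conditions observed in both.
        shared = ~np.isnan(X[gene]) & ~np.isnan(X[other])
        if shared.sum() < 2:
            continue
        a, b = X[gene, shared], X[other, shared]
        if a.std() == 0 or b.std() == 0:
            continue  # correlation is undefined for constant profiles
        # Pearson correlation rescaled from [-1, 1] to [0, 1].
        expr_sim = (np.corrcoef(a, b)[0, 1] + 1) / 2
        scores[other] = alpha * expr_sim + (1 - alpha) * sem_sim[gene, other]

    # Keep the k best-scoring genes, dropping any that were never scored.
    neighbors = np.argsort(scores)[::-1][:k]
    neighbors = neighbors[np.isfinite(scores[neighbors])]
    if neighbors.size == 0:
        raise ValueError("no usable neighbor for this entry")
    values = X[neighbors, cond]
    return float(values.min()), float(values.max())
```

In a joint optimization setting, an interval like this could serve as a box constraint on the imputed value while the clustering objectives are optimized. How the paper actually couples the two is not stated in the abstract.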
Pages: 13639-13656
Page count: 18
Related papers
50 in total
  • [11] Evaluation and optimization of clustering in gene expression data analysis
    Famili, AF
    Liu, GM
    Liu, ZY
    [J]. BIOINFORMATICS, 2004, 20 (10) : 1535 - 1545
  • [12] Gene expression data clustering and visualization based on a binary hierarchical clustering framework
    Szeto, LK
    Liew, AWC
    Yan, H
    Tang, SS
    [J]. JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2003, 14 (04): : 341 - 362
  • [13] Feature Selection and Clustering of Gene Expression Profiles Using Biological Knowledge
    Mitra, Sushmita
    Ghosh, Sampreeti
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2012, 42 (06): : 1590 - 1599
  • [14] A New Framework for Co-clustering of Gene Expression Data
    Zhang, Shuzhong
    Wang, Kun
    Chen, Bilian
    Huang, Xiuzhen
    [J]. PATTERN RECOGNITION IN BIOINFORMATICS, 2011, 7036 : 1 - +
  • [15] Clustering gene expression data using a diffraction‐inspired framework
    Steven C Dinger
    Michael A Van Wyk
    Sergio Carmona
    David M Rubin
    [J]. BioMedical Engineering OnLine, 11
  • [16] Gene-Expression Data Semi-Supervised Clustering in Multi-Objective Optimization Framework
    Alok, Abhay Kumar
    Saha, Sriparna
    Ekbal, Asif
    [J]. 2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 1081 - 1086
  • [17] Algorithm for Clustering Analysis of Gene Expression Data using MapReduce Framework
    Priya, P. Packia Amutha
    Lawrance, R.
    [J]. 2016 INTERNATIONAL CONFERENCE ON COMPUTING TECHNOLOGIES AND INTELLIGENT DATA ENGINEERING (ICCTIDE'16), 2016,
  • [18] On biological validity indices for soft clustering algorithms for gene expression data
    Wu, Han-Ming
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (05) : 1969 - 1979
  • [19] Combining Expression Data and Knowledge Ontology for Gene Clustering and Network Reconstruction
    Wei-Po Lee
    Chung-Hsun Lin
    [J]. Cognitive Computation, 2016, 8 : 217 - 227
  • [20] Combining Expression Data and Knowledge Ontology for Gene Clustering and Network Reconstruction
    Lee, Wei-Po
    Lin, Chung-Hsun
    [J]. COGNITIVE COMPUTATION, 2016, 8 (02) : 217 - 227