Row and Column Structure-Based Biclustering for Gene Expression Data

Cited by: 2
Authors
Qian, Subin [1, 2]
Liu, Huiyi [1]
Yuan, Xiaofeng [2]
Wei, Wei [3]
Chen, Shuangshuang [2]
Yan, Hong [4]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing 210024, Peoples R China
[2] Yancheng Teachers Univ, Sch Informat Engn, Yancheng 224002, Peoples R China
[3] Anqing Normal Univ, Sch Teacher Educ, Anqing 246133, Peoples R China
[4] City Univ Hong Kong, Dept Elect Engn, Hong Kong 999077, Peoples R China
Keywords
Gene expression; Complexity theory; Clustering algorithms; Greedy algorithms; Clustering methods; Computational biology; Bioinformatics; Biclustering; checkerboard pattern; row and column selection; MICROARRAY DATA; ALGORITHMS; PATTERNS;
DOI
10.1109/TCBB.2020.3022085
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
With the development of high-throughput technologies for gene analysis, biclustering has attracted much attention. However, existing methods suffer from high time and space complexity. This paper proposes a biclustering method, called Row and Column Structure-based Biclustering (RCSBC), that finds checkerboard patterns in microarray data with low time and space complexity. First, the paper describes the structure of a bicluster in terms of the structure of rows and columns. Second, it selects representative rows and columns with two algorithms. Finally, the gene expression data are biclustered in the space spanned by the representative rows and columns. To the best of our knowledge, this paper is the first to exploit the relationship between the row/column structure of a gene expression matrix and the structure of biclusters. Both synthetic and real-life gene expression datasets are used to validate the effectiveness of the method. The experimental results show that RCSBC outperforms state-of-the-art algorithms in both clustering accuracy and time/space complexity. This study offers new insights into biclustering large-scale gene expression data without loading the whole dataset into memory.
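The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the two selection algorithms, so this is not RCSBC itself: greedy farthest-point selection and nearest-representative assignment are assumptions swapped in for illustration, and the function names `pick_representatives`, `assign_to_nearest`, and `checkerboard_bicluster` are hypothetical. Unlike the method described above, the sketch also holds the whole matrix in memory.

```python
import numpy as np


def pick_representatives(vectors, k):
    """Greedy farthest-point selection of k vectors.

    ASSUMPTION: a stand-in for the paper's selection algorithms,
    which the abstract does not describe.
    """
    chosen = [0]
    # Distance from every vector to its nearest chosen representative.
    dist = np.linalg.norm(vectors - vectors[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return chosen


def assign_to_nearest(vectors, reps):
    """Label each vector with the index (into reps) of its nearest representative."""
    diffs = vectors[:, None, :] - vectors[reps][None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)


def checkerboard_bicluster(X, n_row_groups, n_col_groups):
    """Group rows and columns using only the representative rows and columns."""
    rep_rows = pick_representatives(X, n_row_groups)
    rep_cols = pick_representatives(X.T, n_col_groups)
    # Rows are compared on the representative columns only, and vice versa.
    row_labels = assign_to_nearest(X[:, rep_cols], rep_rows)
    col_labels = assign_to_nearest(X.T[:, rep_rows], rep_cols)
    return row_labels, col_labels


# Synthetic checkerboard: 2 row groups x 3 column groups plus small noise.
block_means = np.array([[0.0, 5.0, 10.0], [10.0, 0.0, 5.0]])
rng = np.random.default_rng(0)
X = np.kron(block_means, np.ones((6, 4))) + rng.normal(scale=0.1, size=(12, 12))
row_labels, col_labels = checkerboard_bicluster(X, 2, 3)
print(row_labels)
print(col_labels)
```

Each pair of a row label and a column label identifies one block of the checkerboard. Working in the reduced space means each row is compared on only `n_col_groups` values rather than on every column, which is the kind of saving the abstract attributes to the use of representative rows and columns.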
Pages: 1117-1129
Page count: 13
Related papers
50 in total
  • [1] Biclustering of gene expression data using biclustering iterative signature algorithm and biclustering coherent column
    Kumar, E. Saravana
    Vengatesan, K.
    Singh, R. P.
    Rajan, C.
    [J]. INTERNATIONAL JOURNAL OF BIOMEDICAL ENGINEERING AND TECHNOLOGY, 2018, 26 (3-4) : 341 - 352
  • [2] runibic: a Bioconductor package for parallel row-based biclustering of gene expression data
    Orzechowski, Patryk
    Panszczyk, Artur
    Huang, Xiuzhen
    Moore, Jason H.
    [J]. BIOINFORMATICS, 2018, 34 (24) : 4302 - 4304
  • [3] UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data
    Zhenjia Wang
    Guojun Li
    Robert W. Robinson
    Xiuzhen Huang
    [J]. Scientific Reports, 6
  • [4] UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data
    Wang, Zhenjia
    Li, Guojun
    Robinson, Robert W.
    Huang, Xiuzhen
    [J]. SCIENTIFIC REPORTS, 2016, 6
  • [5] On Biclustering of Gene Expression Data
    Mounir, Mahmoud
    Hamdy, Mohamed
    [J]. 2015 IEEE SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND INFORMATION SYSTEMS (ICICIS), 2015, : 641 - 648
  • [6] On Biclustering of Gene Expression Data
    Mukhopadhyay, Anirban
    Maulik, Ujjwal
    Bandyopadhyay, Sanghamitra
    [J]. CURRENT BIOINFORMATICS, 2010, 5 (03) : 204 - 216
  • [7] Biclustering On Gene Expression Data
    Shruthi, M. P.
    [J]. 2017 INTERNATIONAL CONFERENCE ON ALGORITHMS, METHODOLOGY, MODELS AND APPLICATIONS IN EMERGING TECHNOLOGIES (ICAMMAET), 2017,
  • [8] Seed-Based Biclustering of Gene Expression Data
    An, Jiyuan
    Liew, Alan Wee-Chung
    Nelson, Colleen C.
    [J]. PLOS ONE, 2012, 7 (08):
  • [9] Bayesian biclustering of gene expression data
    Jiajun Gu
    Jun S Liu
    [J]. BMC Genomics, 9
  • [10] Biclustering in gene expression data by tendency
    Liu, JZ
    Yang, J
    Wang, W
    [J]. 2004 IEEE COMPUTATIONAL SYSTEMS BIOINFORMATICS CONFERENCE, PROCEEDINGS, 2004, : 182 - 193