Clustering on Sparse Data in Non-Overlapping Feature Space with Applications to Cancer Subtyping

Cited by: 2
Authors
Kang, Tianyu [1 ]
Zarringhalam, Kourosh [2 ]
Kuijjer, Marieke [3 ]
Chen, Ping [1 ]
Quackenbush, John [3 ]
Ding, Wei [1 ]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
[2] Univ Massachusetts, Dept Math, Boston, MA 02125 USA
[3] Dana Farber Canc Inst, Boston, MA 02115 USA
Funding
U.S. National Science Foundation;
Keywords
Unsupervised Learning; Clustering; Artificial Neural Networks;
DOI
10.1109/ICDM.2018.00138
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new algorithm, Reinforced and Informed Network-based Clustering (RINC), for finding unknown groups of similar data objects in a sparse and largely non-overlapping feature space where a network structure among features can be observed. Sparse, non-overlapping unlabeled data are becoming increasingly common and available, especially in text mining and biomedical data mining. RINC inserts a domain-informed model into a modelless neural network. In particular, our approach integrates physically meaningful feature dependencies into the neural network architecture and into a soft computational constraint. Our learning algorithm efficiently clusters sparse data through integrated smoothing and sparse auto-encoder learning. The informed design requires fewer training samples, and at least part of the model becomes explainable. The architecture of the reinforced network layers smooths sparse data over the network dependency in the feature space. Most importantly, through back-propagation, the weights of the reinforced smoothing layers are simultaneously constrained by the remaining sparse auto-encoder layers, which set the target values equal to the raw inputs. Empirical results demonstrate that RINC achieves improved accuracy and produces physically meaningful clustering results.
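The architecture outlined in the abstract (a smoothing layer restricted to the feature network, followed by a sparse auto-encoder whose target is the raw input) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 0/1 edge mask, the sigmoid hidden layer, the L1 sparsity penalty, and all names (`smoothing_mask`, `forward`, `loss`, `sparsity_weight`) are assumptions chosen only to make the idea concrete, and no training loop is shown.

```python
import numpy as np

def smoothing_mask(adjacency):
    """0/1 mask that permits weights only between linked features, plus self-loops."""
    a = np.asarray(adjacency, dtype=float)
    return ((a + a.T + np.eye(a.shape[0])) > 0).astype(float)

def forward(x, w_smooth, mask, w_enc, w_dec):
    """Masked smoothing layer followed by a one-hidden-layer auto-encoder."""
    smoothed = x @ (w_smooth * mask)                     # signal spreads only along network edges
    hidden = 1.0 / (1.0 + np.exp(-(smoothed @ w_enc)))   # sigmoid code (assumed activation)
    recon = hidden @ w_dec                               # linear decoder
    return smoothed, hidden, recon

def loss(x, recon, hidden, sparsity_weight=1e-3):
    """Reconstruction error against the RAW input plus an assumed L1 sparsity term."""
    return np.mean((recon - x) ** 2) + sparsity_weight * np.mean(np.abs(hidden))

# Toy feature network: four features linked in a chain 0-1-2-3.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
mask = smoothing_mask(adjacency)

# Two sparse samples that share no non-zero feature.
x = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])

rng = np.random.default_rng(0)
w_smooth = mask / mask.sum(axis=0, keepdims=True)  # start from neighbourhood averaging
w_enc = rng.normal(scale=0.1, size=(4, 2))
w_dec = rng.normal(scale=0.1, size=(2, 4))

smoothed, hidden, recon = forward(x, w_smooth, mask, w_enc, w_dec)
total_loss = loss(x, recon, hidden)
```

In this toy setting the two raw samples have zero overlap, while their smoothed versions share mass on neighbouring features, which is what makes them comparable for clustering. In a trained model the gradient of the reconstruction loss would flow back through `w_smooth * mask`, so the smoothing weights would be shaped by the auto-encoder objective rather than fixed in advance.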
Pages: 1079-1084
Page count: 6
Related papers
50 results in total
  • [1] Feature correspondence in a non-overlapping camera network
    Zong Jie Xiang
    Qiren Chen
    Yuncai Liu
    Multimedia Tools and Applications, 2014, 73 : 1129 - 1145
  • [2] Feature correspondence in a non-overlapping camera network
    Xiang, Zong Jie
    Chen, Qiren
    Liu, Yuncai
    MULTIMEDIA TOOLS AND APPLICATIONS, 2014, 73 (03) : 1129 - 1145
  • [3] Identification of overlapping and non-overlapping community structure by fuzzy clustering in complex networks
    Sun, Peng Gang
    Gao, Lin
    Han, Shan Shan
    INFORMATION SCIENCES, 2011, 181 (06) : 1060 - 1071
  • [4] A sparse dynamic programming algorithm for alignment with non-overlapping inversions
    do Lago, AP
    Muchnik, I
    Kulikowski, C
    RAIRO-THEORETICAL INFORMATICS AND APPLICATIONS, 2005, 39 (01): : 175 - 189
  • [5] Overlapping Community Detection Based on Strong Tie Detection and Non-Overlapping Link Clustering
    Guo, Lin
    Zhang, Miao
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [6] Constant Size Point Cloud Clustering: A Compact, Non-Overlapping Solution
    Guarda, Andre F. R.
    Rodrigues, Nuno M. M.
    Pereira, Fernando
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 77 - 91
  • [7] Non-overlapping Indexing in BWT-Runs Bounded Space
    Gibney, Daniel
    Macnichol, Paul
    Thankachan, Sharma V.
    STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2023, 2023, 14240 : 260 - 270
  • [8] Clustering Sparse Data With Feature Correlation With Application to Discover Subtypes in Cancer
    Qiang, Jipeng
    Ding, Wei
    Kuijjer, Marieke
    Quackenbush, John
    Chen, Ping
    IEEE ACCESS, 2020, 8 : 67775 - 67789
  • [9] Label propagation based evolutionary clustering for detecting overlapping and non-overlapping communities in dynamic networks
    Liu, Ke
    Huang, Jianbin
    Sun, Heli
    Wan, Mengjie
    Qi, Yutao
    Li, He
    KNOWLEDGE-BASED SYSTEMS, 2015, 89 : 487 - 496
  • [10] Appearance Feature Based Human Correspondence under Non-overlapping Views
    Chae, Hyun-Uk
    Jo, Kang-Hyun
    EMERGING INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PROCEEDINGS, 2009, 5754 : 635 - 644