Clustering on Sparse Data in Non-Overlapping Feature Space with Applications to Cancer Subtyping

Cited by: 2
Authors
Kang, Tianyu [1 ]
Zarringhalam, Kourosh [2 ]
Kuijjer, Marieke [3 ]
Chen, Ping [1 ]
Quackenbush, John [3 ]
Ding, Wei [1 ]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
[2] Univ Massachusetts, Dept Math, Boston, MA 02125 USA
[3] Dana Farber Canc Inst, Boston, MA 02115 USA
Funding
National Science Foundation (USA);
Keywords
Unsupervised Learning; Clustering; Artificial Neural Networks;
DOI
10.1109/ICDM.2018.00138
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new algorithm, Reinforced and Informed Network-based Clustering (RINC), for finding unknown groups of similar data objects in a sparse and largely non-overlapping feature space where a network structure among the features can be observed. Sparse, non-overlapping, unlabeled data are becoming increasingly common and available, especially in text mining and biomedical data mining. RINC inserts a domain-informed model into a modelless neural network. In particular, our approach integrates physically meaningful feature dependencies into the neural network architecture and a soft computational constraint. Our learning algorithm efficiently clusters sparse data through integrated smoothing and sparse auto-encoder learning. The informed design requires fewer training samples, and at least part of the model becomes explainable. The architecture of the reinforced network layers smooths sparse data over the network dependencies in the feature space. Most importantly, through back-propagation, the weights of the reinforced smoothing layers are simultaneously constrained by the remaining sparse auto-encoder layers, whose target values are set equal to the raw inputs. Empirical results demonstrate that RINC achieves improved accuracy and yields physically meaningful clustering results.
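The two ideas named in the abstract, smoothing sparse data over a feature network and training an auto-encoder whose target is the raw input, can be illustrated with a small NumPy sketch. This is not the authors' RINC implementation: the fixed mixing weight `alpha`, the row-normalized adjacency, the sigmoid layers, the KL sparsity penalty, and every function name below are assumptions chosen for illustration, and only a forward pass is shown (in the paper the smoothing weights are learned through back-propagation).

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and row-normalize a feature adjacency matrix."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)

def smooth(x, adj_norm, alpha=0.5):
    """Spread each sample's signal to neighbouring features (assumed fixed alpha)."""
    return (1.0 - alpha) * x + alpha * (x @ adj_norm)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_autoencoder_loss(x_raw, x_smooth, w_enc, w_dec, rho=0.05, beta=0.1):
    """Encode the smoothed data, decode, and compare against the RAW inputs."""
    h = sigmoid(x_smooth @ w_enc)            # hidden code
    x_hat = sigmoid(h @ w_dec)               # reconstruction
    recon = np.mean((x_hat - x_raw) ** 2)    # target is the raw input
    rho_hat = np.clip(h.mean(axis=0), 1e-8, 1.0 - 1e-8)
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
    return recon + beta * kl, h

# Hypothetical toy data: 4 features connected in a chain 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Three samples, each with a single non-zero feature (no overlap between them).
x = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1]], dtype=float)

x_smooth = smooth(x, normalize_adjacency(adj))

rng = np.random.default_rng(0)
w_enc = rng.normal(scale=0.1, size=(4, 2))   # 4 features -> 2 hidden units
w_dec = rng.normal(scale=0.1, size=(2, 4))
loss, h = sparse_autoencoder_loss(x, x_smooth, w_enc, w_dec)
```

In this toy setting the first two samples share no features before smoothing, so their dot product is zero, but they become comparable afterwards because features 0 and 1 are neighbours in the network. A clustering step (for example k-means on `h`) would then operate on the hidden codes; that step is omitted here.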
Pages: 1079-1084
Page count: 6