Clustering on Sparse Data in Non-Overlapping Feature Space with Applications to Cancer Subtyping

Cited by: 2
Authors
Kang, Tianyu [1]
Zarringhalam, Kourosh [2]
Kuijjer, Marieke [3]
Chen, Ping [1]
Quackenbush, John [3]
Ding, Wei [1]
Affiliations
[1] Univ Massachusetts, Dept Comp Sci, Boston, MA 02125 USA
[2] Univ Massachusetts, Dept Math, Boston, MA 02125 USA
[3] Dana Farber Canc Inst, Boston, MA 02115 USA
Funding
National Science Foundation (USA);
Keywords
Unsupervised Learning; Clustering; Artificial Neural Networks;
DOI
10.1109/ICDM.2018.00138
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a new algorithm, Reinforced and Informed Network-based Clustering (RINC), for finding unknown groups of similar data objects in a sparse and largely non-overlapping feature space where a network structure among the features can be observed. Sparse, non-overlapping unlabeled data are becoming increasingly common and available, especially in text mining and biomedical data mining. RINC inserts a domain-informed model into a model-less neural network. In particular, our approach integrates physically meaningful feature dependencies into the neural network architecture and a soft computational constraint. Our learning algorithm efficiently clusters sparse data through integrated smoothing and sparse auto-encoder learning. The informed design requires fewer training samples, and at least part of the model becomes explainable. The architecture of the reinforced network layers smooths sparse data over the network dependencies in the feature space. Most importantly, through back-propagation, the weights of the reinforced smoothing layers are simultaneously constrained by the remaining sparse auto-encoder layers, which set the target values equal to the raw inputs. Empirical results demonstrate that RINC achieves improved accuracy and yields physically meaningful clustering results.
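The two ideas named in the abstract, smoothing sparse data over a feature network and reconstructing the raw inputs with a sparse auto-encoder, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed one-step propagation with mixing weight `alpha`, the `tanh` encoder, and the L1 penalty are all assumptions chosen for illustration, and the smoothing weights here are fixed rather than learned through back-propagation as in RINC.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and row-normalize a feature adjacency matrix (assumed form)."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)

def smooth(x, adj_norm, alpha=0.5):
    """One propagation step: blend each feature with the mean of its neighborhood."""
    return (1 - alpha) * x + alpha * x @ adj_norm.T

def reconstruction_loss(x, x_smooth, w_enc, w_dec, l1=1e-3):
    """Auto-encoder loss on smoothed input; the target is the raw input x."""
    h = np.tanh(x_smooth @ w_enc)      # hidden code
    x_hat = h @ w_dec                  # linear reconstruction
    return np.mean((x_hat - x) ** 2) + l1 * np.mean(np.abs(h))

# Toy data (hypothetical): 4 features linked in a chain 0-1-2-3,
# and two samples that share no non-zero feature.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.array([[1, 0, 0, 0],
              [0, 0, 0, 1]], dtype=float)

adj_norm = normalize_adjacency(adj)
x_smooth = smooth(x, adj_norm)

# Random, untrained weights, only to show that the loss can be evaluated.
rng = np.random.default_rng(0)
w_enc = rng.normal(size=(4, 2))
w_dec = rng.normal(size=(2, 4))
loss = reconstruction_loss(x, x_smooth, w_enc, w_dec)
```

After smoothing, each sample spreads some signal onto features adjacent to its non-zero ones, which is how network smoothing can make samples with non-overlapping features comparable once the signal has propagated far enough.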
Pages: 1079-1084
Page count: 6
Related papers
50 in total
  • [21] A New Hierarchical Clustering Algorithm to Identify Non-overlapping Like-minded Communities
    Deepak, Talasila Sai
    Adhya, Hindol
    Kejriwal, Shyamal
    Gullapalli, Bhanuteja
    Shannigrahi, Saswata
    PROCEEDINGS OF THE 27TH ACM CONFERENCE ON HYPERTEXT AND SOCIAL MEDIA (HT'16), 2016, : 319 - 321
  • [22] Decomposing Subspaces of EEG Channel Space into Potentials of Non-overlapping Distributed Sources
    Nolte, Guido
    Sosa, Pedro Valdes
    WORLD CONGRESS ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING, VOL 25, PT 2 - DIAGNOSTIC IMAGING, 2009, 25 : 749 - 752
  • [23] Parallel incomplete Cholesky preconditioners based on the non-overlapping data distribution
    Haase, G
    PARALLEL COMPUTING, 1998, 24 (11) : 1685 - 1703
  • [24] I/O-efficient data structures for non-overlapping indexing
    Hooshmand, Sahar
    Abedin, Paniz
    Kulekci, M. Oguzhan
    Thankachan, Sharma V.
    THEORETICAL COMPUTER SCIENCE, 2021, 857 : 1 - 7
  • [25] Feature Masking on Non-Overlapping Regions for Detecting Dense Cells in Blood Smear Image
    Wu, Huisi
    Lin, Canfeng
    Liu, Jiasheng
    Song, Youyi
    Wen, Zhenkun
    Qin, Jing
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42 (06) : 1668 - 1680
  • [26] Vertical federated learning-based feature selection with non-overlapping sample utilization
    Feng, Siwei
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 208
  • [27] An efficient algorithm to detect common ancestor genes for non-overlapping inversion and applications
    Zohora, Fatema Tuz
    Rahman, M. Sohel
    THEORETICAL COMPUTER SCIENCE, 2016, 656 : 188 - 214
  • [28] Estimating Treatment Effects Using Observational Data and Experimental Data with Non-Overlapping Support
    Han, Kevin
    Wu, Han
    Wu, Linjia
    Shi, Yu
    Liu, Canyao
    ECONOMETRICS, 2024, 12 (03)
  • [29] Atoms in molecules as non-overlapping, bounded, space-filling open quantum systems
    Bader, Richard F. W.
    Matta, Cherif F.
    FOUNDATIONS OF CHEMISTRY, 2013, 15 (03) : 253 - 276
  • [30] Atoms in molecules as non-overlapping, bounded, space-filling open quantum systems
    Richard F. W. Bader
    Chérif F. Matta
    Foundations of Chemistry, 2013, 15 : 253 - 276