A Fast Semi-Supervised Clustering Framework for Large-Scale Time Series Data

Cited by: 17
Authors
He, Guoliang [1]
Pan, Yanzhou [2]
Xia, Xuewen [3]
He, Jinrong [4]
Peng, Rong [1]
Xiong, Neal N. [5]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[2] Rice Univ, Engn Dept, Houston, TX 77005 USA
[3] Minnan Normal Univ, Coll Phys & Informat Engn, Zhangzhou 363000, Peoples R China
[4] Yanan Univ, Coll Math & Comp Sci, Yanan 716000, Peoples R China
[5] Northeastern State Univ, Dept Math & Comp Sci, Tahlequah, OK 74464 USA
Funding
National Natural Science Foundation of China
Keywords
Time series analysis; Clustering algorithms; Time measurement; Velocity measurement; Shape measurement; Clustering methods; Contracts; Constraint propagation; semi-supervised learning; similarity measure; time series clustering; CLASSIFICATION;
DOI
10.1109/TSMC.2019.2931731
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Semi-supervised clustering algorithms have several limitations: 1) their computational complexity is very high, because calculating the similarity distances between pairs of examples is time-consuming; 2) traditional semi-supervised clustering methods do not consider how to make full use of must-link and cannot-link constraints: a small number of pairwise constraints contributes very little to clustering performance, and some constraints may even negatively affect the outcome; and 3) these methods are not effective at handling high-dimensional data, especially time series data. To date, little work has addressed semi-supervised clustering of time series data. To efficiently cluster large-scale time series data, we first tackle contract time series clustering, which aims to produce the most accurate clustering results within a contracted time. We propose a semi-supervised time series clustering framework (STSC), which integrates a fast similarity measure and a constraint propagation approach. Based on the proposed framework, two semi-supervised clustering algorithms, fssK-means and fssDBSCAN, are designed. Experiments on 11 datasets show that the proposed method is efficient and effective for clustering large-scale time series data.
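The abstract does not say how STSC propagates constraints. As a generic illustration of what propagating must-link and cannot-link constraints can mean, the sketch below applies the textbook transitive-closure rule: must-link pairs merge items into groups, and a cannot-link between two items extends to every pair across their groups. The function name `propagate_constraints` and its arguments are hypothetical, and this should not be read as the paper's algorithm.

```python
from itertools import combinations


def propagate_constraints(n, must_link, cannot_link):
    """Expand pairwise constraints over n items by transitive closure.

    Illustrative assumption: this is the generic closure rule, not the
    propagation scheme used by STSC.
    """
    # Union-find over item indices; each must-link pair merges two groups.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)

    # Collect the members of each must-link group, keyed by group root.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Every pair inside a group is an implied must-link.
    ml = {pair for g in groups.values() for pair in combinations(sorted(g), 2)}

    # A cannot-link between two items extends to their whole groups.
    cl = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"inconsistent constraints on items {a} and {b}")
        for x in groups[ra]:
            for y in groups[rb]:
                cl.add((min(x, y), max(x, y)))
    return ml, cl
```

For example, with five series, must-links (0, 1) and (1, 2), and a cannot-link (2, 3), the closure adds the must-link (0, 2) and the cannot-links (0, 3) and (1, 3). This shows how a few labeled pairs can be stretched into more supervision before clustering.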
Pages: 4201-4216
Page count: 16