DTCM: Deep Transformer Capsule Mutual Distillation for Multivariate Time Series Classification

Cited by: 11
Authors
Xiao, Zhiwen [1, 2, 3]
Xu, Xin [4]
Xing, Huanlai [1, 2, 3]
Zhao, Bowen [1, 2, 3]
Wang, Xinhan [1, 2, 3]
Song, Fuhong [5]
Qu, Rong [6]
Feng, Li [1, 2, 3]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu, Peoples R China
[2] Southwest Jiaotong Univ, Tangshan Inst, Tangshan 063000, Peoples R China
[3] Minist Educ, Engn Res Ctr Sustainable Urban Intelligent Transpo, Chengdu 611756, Peoples R China
[4] China Univ Min & Technol, Sch Comp Sci & Technol, Xuzhou 221166, Peoples R China
[5] Guizhou Univ Finance & Econ, Sch Informat, Guiyang 550025, Peoples R China
[6] Univ Nottingham, Sch Comp Sci, Nottingham NG7 2RD, England
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Classification algorithms; Time series analysis; Data mining; Transformers; Routing; Knowledge transfer; Capsule network; data mining; deep learning; knowledge distillation (KD); multivariate time series classification (MTSC); mutual learning; REPRESENTATION; NETWORK;
DOI
10.1109/TCDS.2024.3370219
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article proposes a dual-network-based feature extractor, perceptive capsule network (PCapN), for multivariate time series classification (MTSC), including a local feature network (LFN) and a global relation network (GRN). The LFN has two heads (i.e., Head_A and Head_B), each containing two squash convolutional neural network (CNN) blocks and one dynamic routing block to extract the local features from the data and mine the connections among them. The GRN consists of two capsule-based transformer blocks and one dynamic routing block to capture the global patterns of each variable and correlate the useful information of multiple variables. Unfortunately, it is difficult to directly deploy PCapN on mobile devices due to its strict requirement for computing resources. So, this article designs a lightweight capsule network (LCapN) to mimic the cumbersome PCapN. To promote knowledge transfer from PCapN to LCapN, this article proposes a deep transformer capsule mutual (DTCM) distillation method. It is targeted and offline, using one- and two-way operations to supervise the knowledge distillation (KD) process for the dual-network-based student and teacher models. Experimental results show that the proposed PCapN and DTCM achieve excellent performance on University of East Anglia 2018 (UEA2018) datasets regarding top-1 accuracy.
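The abstract names three building blocks: squash activations, dynamic routing, and knowledge distillation. The sketch below illustrates the textbook versions of each (the capsule squash function, routing-by-agreement, and a temperature-softened distillation loss), written in Python with NumPy. It is not the PCapN, LCapN, or DTCM implementation: the function names, tensor shapes, the three routing iterations, and the temperature `T=4.0` are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def squash(s, axis=-1, eps=1e-8):
    """Generic capsule squash: keeps direction, maps vector length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def dynamic_routing(u_hat, n_iters=3):
    """Generic routing-by-agreement (assumed form, not the paper's exact block).

    u_hat: prediction vectors with shape (n_in, n_out, dim).
    Returns output capsules with shape (n_out, dim).
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits
    for _ in range(n_iters):
        c = softmax(b, axis=1)                      # coupling coefficients
        s = np.sum(c[:, :, None] * u_hat, axis=0)   # weighted sum per output capsule
        v = squash(s)                               # output capsules
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)  # agreement update
    return v


def kd_loss(student_logits, teacher_logits, T=4.0):
    """Generic one-way distillation loss: KL(teacher || student) at temperature T."""
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    return (T ** 2) * np.mean(kl)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u_hat = rng.normal(size=(8, 4, 16))   # 8 input capsules, 4 output capsules
    v = dynamic_routing(u_hat)
    print(v.shape, np.linalg.norm(v, axis=-1))
    teacher = rng.normal(size=(5, 4))
    student = rng.normal(size=(5, 4))
    print(kd_loss(student, teacher))
```

A mutual (two-way) scheme would add a second term with the roles of the two networks swapped; how DTCM combines its one- and two-way operations is not specified in the abstract, so that part is left out here.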
Pages: 1445-1461
Page count: 17