Rethink the Top-u Attention in Sparse Self-attention for Long Sequence Time-Series Forecasting

Cited by: 0
Authors
Meng, Xiangxu [1 ]
Li, Wei [1 ,2 ]
Gaber, Tarek [3 ]
Zhao, Zheng [1 ]
Chen, Chuhao [1 ]
Affiliations
[1] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Modeling & Emulat E Govt Natl Engn Lab, Harbin 150001, Peoples R China
[3] Univ Salford, Sch Sci Engn & Environm, Manchester, England
Funding
National Natural Science Foundation of China
Keywords
Time-series; Top-u Attention; Long-tailed distribution; Sparse self-attention;
DOI
10.1007/978-3-031-44223-0_21
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Long time-series forecasting plays a crucial role in production and daily life, covering areas such as electric power loads, stock trends, and road traffic. Attention-based models have achieved significant performance gains thanks to the long-range modelling capability of self-attention. However, because self-attention suffers from a widely criticized quadratic time complexity, most subsequent work has tried to improve it by exploiting the sparse distribution of attention. Following this line of work, we further investigate the position distribution of Top-u attention within the long-tailed distribution of sparse attention and propose a two-stage self-attention mechanism named ProphetAttention. Specifically, in the training phase ProphetAttention memorizes the positions of the Top-u attention scores, and in the prediction phase it uses the recorded position indices to retrieve the Top-u attention directly for the sparse attention computation, thereby avoiding the redundant cost of re-measuring Top-u attention. Results on four widely used real-world datasets demonstrate that ProphetAttention improves long sequence time-series prediction over the Informer model by approximately 17%-26% across all prediction horizons and significantly accelerates prediction.
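To make the two-stage mechanism concrete, below is a minimal PyTorch sketch of the idea described in the abstract: at training time the layer performs an Informer-style sparsity measurement and memorizes the resulting Top-u query positions; at prediction time it reuses the cached indices and skips the measurement entirely. The class and attribute names (ProphetAttentionSketch, cached_idx, u) are illustrative assumptions, not the authors' implementation, and the sketch assumes single-head attention with a batch layout that stays fixed between training and prediction.

    # A minimal sketch of the two-stage Top-u idea, assuming plain PyTorch.
    import torch
    import torch.nn.functional as F

    class ProphetAttentionSketch(torch.nn.Module):
        """Illustrative only: memorize Top-u query positions in training,
        reuse them at prediction to skip the sparsity measurement."""

        def __init__(self, u: int):
            super().__init__()
            self.u = u              # number of "active" queries kept by sparse attention
            self.cached_idx = None  # Top-u positions memorized during training

        def _measure_top_u(self, q, k):
            # Sparsity measurement: score each query by max(QK^T) - mean(QK^T)
            # and keep the u largest scores. This is exactly the step the
            # prediction phase avoids by reusing the cached indices.
            scores = q @ k.transpose(-2, -1) / k.size(-1) ** 0.5  # (B, Lq, Lk)
            sparsity = scores.max(-1).values - scores.mean(-1)    # (B, Lq)
            return sparsity.topk(self.u, dim=-1).indices          # (B, u)

        def forward(self, q, k, v):
            if self.training or self.cached_idx is None:
                idx = self._measure_top_u(q, k)
                self.cached_idx = idx.detach()  # memorize positions for prediction
            else:
                idx = self.cached_idx           # prediction: reuse, no measurement
            d = q.size(-1)
            # Attend with the selected Top-u queries only.
            q_top = torch.gather(q, 1, idx.unsqueeze(-1).expand(-1, -1, d))
            attn = F.softmax(q_top @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
            # Inactive queries fall back to the mean of V, as in sparse attention.
            out = v.mean(dim=1, keepdim=True).expand(-1, q.size(1), -1).clone()
            out.scatter_(1, idx.unsqueeze(-1).expand(-1, -1, v.size(-1)), attn @ v)
            return out

A hypothetical usage, showing how the two phases differ:

    layer = ProphetAttentionSketch(u=8)
    q = k = v = torch.randn(2, 96, 64)
    layer.train(); _ = layer(q, k, v)    # records Top-u positions
    layer.eval();  out = layer(q, k, v)  # reuses them: no measurement

The design point the sketch illustrates is that the query-scoring pass disappears from the prediction path, replaced by an index lookup, which is where the reported speed-up would come from.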
Pages: 256-267 (12 pages)