Rethink the Top-u Attention in Sparse Self-attention for Long Sequence Time-Series Forecasting

Cited: 0
Authors
Meng, Xiangxu [1 ]
Li, Wei [1 ,2 ]
Gaber, Tarek [3 ]
Zhao, Zheng [1 ]
Chen, Chuhao [1 ]
Institutions
[1] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Harbin Engn Univ, Modeling & Emulat E Govt Natl Engn Lab, Harbin 150001, Peoples R China
[3] Univ Salford, Sch Sci Engn & Environm, Manchester, England
Funding
National Natural Science Foundation of China;
Keywords
Time-series; Top-u Attention; Long-tailed distribution; Sparse self-attention;
DOI
10.1007/978-3-031-44223-0_21
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Long time-series forecasting plays a crucial role in production and daily life, covering areas such as electric power loads, stock trends, and road traffic. Attention-based models have achieved significant performance gains thanks to the long-range modelling capability of self-attention. However, because the quadratic time complexity of self-attention is widely criticized, most follow-up work has tried to reduce it by exploiting the sparse distribution of attention. Following this line of work, we further investigate where Top-u attention falls within the long-tailed distribution of sparse attention and propose a two-stage self-attention mechanism named ProphetAttention. Specifically, during training, ProphetAttention memorizes the positions of the Top-u attention scores; during prediction, it uses these recorded position indices to retrieve the Top-u attention directly for sparse attention computation, thereby avoiding the redundant cost of re-measuring it. Results on four widely used real-world datasets demonstrate that, compared with the Informer model, ProphetAttention improves long sequence time-series prediction by approximately 17%-26% across all prediction horizons and significantly speeds up prediction.
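To make the two-stage mechanism described in the abstract concrete, the sketch below shows one plausible PyTorch realization. It is a minimal illustration, not the authors' implementation: the class name TwoStageTopUAttention, the Informer-style max-minus-mean sparsity measurement used to rank queries, and the simple index-caching scheme are all assumptions introduced here.

```python
# Hypothetical sketch of a two-stage Top-u sparse attention (not the paper's
# released code). Stage 1 (training): measure query sparsity and memorize the
# Top-u positions. Stage 2 (prediction): reuse the memorized indices and skip
# the measurement step entirely.
import torch
import torch.nn.functional as F


class TwoStageTopUAttention(torch.nn.Module):
    def __init__(self, u: int):
        super().__init__()
        self.u = u              # number of "active" queries kept per head
        self.cached_idx = None  # Top-u query positions recorded in training

    def _measure_top_u(self, q, k):
        # Informer-style score: max over keys minus mean over keys (assumed).
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)  # (B,H,Lq,Lk)
        m = scores.max(dim=-1).values - scores.mean(dim=-1)     # (B,H,Lq)
        return m.topk(self.u, dim=-1).indices                   # (B,H,u)

    def forward(self, q, k, v):
        # Simplification: assumes batch/head shapes match between the last
        # training step and prediction, so the cached indices line up.
        if self.training or self.cached_idx is None:
            idx = self._measure_top_u(q, k)
            self.cached_idx = idx.detach()
        else:
            idx = self.cached_idx

        b, h, _, d = q.shape
        gather_idx = idx.unsqueeze(-1).expand(b, h, self.u, d)
        q_top = q.gather(2, gather_idx)                          # (B,H,u,d)

        # Full attention only for the u active queries.
        attn = F.softmax(q_top @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
        ctx_top = attn @ v                                       # (B,H,u,d)

        # Inactive queries fall back to the mean of values (ProbSparse-style).
        out = v.mean(dim=2, keepdim=True).expand(b, h, q.size(2), d).clone()
        out.scatter_(2, gather_idx, ctx_top)
        return out


# Toy usage: u=8 active queries out of Lq=96.
attn = TwoStageTopUAttention(u=8)
q = torch.randn(2, 4, 96, 32)
k = torch.randn(2, 4, 96, 32)
v = torch.randn(2, 4, 96, 32)
attn.train()
_ = attn(q, k, v)    # stage 1: indices are measured and memorized
attn.eval()
out = attn(q, k, v)  # stage 2: memorized indices reused, no measuring
```

The point of the second stage is that prediction reuses the memorized indices, so the per-query sparsity measurement, the main extra cost of ProbSparse-style attention, is skipped at inference time; this matches the abstract's claim of avoiding redundant Top-u measurement.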
Pages: 256-267 (12 pages)
Related Papers (50 in total)
  • [41] Multiscale echo self-attention memory network for multivariate time series classification
    Lyu, Huizi
    Huang, Desen
    Li, Sen
    Ma, Qianli
    Ng, Wing W. Y.
    NEUROCOMPUTING, 2023, 520 : 60 - 72
  • [42] Mixformer: An improved self-attention architecture applied to multivariate chaotic time series
    Fu, Ke
    Li, He
    Bai, Yan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 241
  • [43] ALAE: self-attention reconstruction network for multivariate time series anomaly identification
    Jiang, Kai
    Liu, Hui
    Ruan, Huaijun
    Zhao, Jia
    Lin, Yuxiu
    SOFT COMPUTING, 2023, 27 : 10509 - 10519
  • [44] Hierarchical multihead self-attention for time-series-based fault diagnosis
    Wang, Chengtian
    Shi, Hongbo
    Song, Bing
    Tao, Yang
    CHINESE JOURNAL OF CHEMICAL ENGINEERING, 2024, 70 : 104 - 117
  • [45] Time Series Anomaly Detection in Vehicle Sensors Using Self-Attention Mechanisms
    Zhang, Ze
    Yao, Yue
    Hutabarat, Windo
    Farnsworth, Michael
    Tiwari, Divya
    Tiwari, Ashutosh
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024: 15964 - 15976
  • [46] Research on Time Series Prediction via Quantum Self-Attention Neural Networks
    Chen, X.
    Li, C.
    Jin, F.
    Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China, 2024, 53 (01): 110 - 118
  • [47] DFNet: Decomposition fusion model for long sequence time-series forecasting
    Zhang, Fan
    Guo, Tiantian
    Wang, Hua
    KNOWLEDGE-BASED SYSTEMS, 2023, 277
  • [48] Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
    Zhou, Haoyi
    Zhang, Shanghang
    Peng, Jieqi
    Zhang, Shuai
    Li, Jianxin
    Xiong, Hui
    Zhang, Wancai
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 11106 - 11115
  • [49] SALSTM: segmented self-attention long short-term memory for long-term forecasting
    Dai, Zhi-Qiang
    Li, Jie
    Cao, Yang-Jie
    Zhang, Yong-Xiang
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [50] Position-Based Content Attention for Time Series Forecasting with Sequence-to-Sequence RNNs
    Cinar, Yagmur Gizem
    Mirisaee, Hamid
    Goswami, Parantapa
    Gaussier, Eric
    Ait-Bachir, Ali
    Strijov, Vadim
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT V, 2017, 10638 : 533 - 544