Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers

Cited: 0
Authors
Gu, Albert [1 ]
Johnson, Isys [3 ]
Goel, Karan [1 ]
Saab, Khaled [2 ]
Dao, Tri [1 ]
Rudra, Atri [3 ]
Re, Christopher [1 ]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[3] SUNY Buffalo, Univ Buffalo, Dept Comp Sci & Engn, Buffalo, NY USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency. We introduce a simple sequence model inspired by control systems that generalizes these approaches while addressing their shortcomings. The Linear State-Space Layer (LSSL) maps a sequence u ↦ y by simply simulating a linear continuous-time state-space representation ẋ = Ax + Bu, y = Cx + Du. Theoretically, we show that LSSL models are closely related to the three aforementioned families of models and inherit their strengths. For example, they generalize convolutions to continuous-time, explain common RNN heuristics, and share features of NDEs such as time-scale adaptation. We then incorporate and generalize recent theory on continuous-time memorization to introduce a trainable subset of structured matrices A that endow LSSLs with long-range memory. Empirically, stacking LSSL layers into a simple deep neural network obtains state-of-the-art results across time series benchmarks for long dependencies in sequential image classification, real-world healthcare regression tasks, and speech. On a difficult speech classification task with length-16000 sequences, LSSL outperforms prior approaches by 24 accuracy points, and even outperforms baselines that use hand-crafted features on 100x shorter sequences.
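The state-space view in the abstract admits a compact recurrent implementation once ẋ = Ax + Bu is discretized. The following is a minimal NumPy sketch, not the authors' code: it assumes a bilinear discretization with an illustrative step size dt, a single scalar input/output channel, and a random state matrix in place of the trainable structured (HiPPO-derived) A the paper uses; the names lssl_recurrence and discretize_bilinear are hypothetical.

```python
import numpy as np

def discretize_bilinear(A, B, dt):
    """Bilinear (Tustin) discretization of x' = Ax + Bu with step size dt.

    Returns (Abar, Bbar) such that x_k = Abar @ x_{k-1} + Bbar * u_k.
    (The paper's generalized bilinear transform reduces to this choice.)
    """
    N = A.shape[0]
    I = np.eye(N)
    inv = np.linalg.inv(I - (dt / 2.0) * A)
    return inv @ (I + (dt / 2.0) * A), inv @ (dt * B)

def lssl_recurrence(A, B, C, D, u, dt=1.0):
    """Evaluate the state-space map u -> y as an RNN-style recurrence."""
    Abar, Bbar = discretize_bilinear(A, B, dt)
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = Abar @ x + Bbar * u_k      # state update
        ys.append(C @ x + D * u_k)     # readout y = Cx + Du
    return np.array(ys)

# Toy usage with a random (roughly stable) state matrix rather than the
# structured matrices the paper trains:
rng = np.random.default_rng(0)
N = 4
A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))
B = rng.standard_normal(N)
C = rng.standard_normal(N)
D = 0.0
u = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
y = lssl_recurrence(A, B, C, D, u, dt=0.1)
print(y.shape)  # (100,)
```

The same layer can equivalently be evaluated as a convolution of u with the kernel formed by C Ā^k B̄ over steps k, which is the convolutional view the abstract alludes to; the recurrence above is just the sequential form of that computation.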
Pages: 14
Related Papers
50 in total
  • [1] Cointegrated continuous-time linear state-space and MCARMA models
    Fasen-Hartmann, Vicky
    Scholz, Markus
    [J]. STOCHASTICS-AN INTERNATIONAL JOURNAL OF PROBABILITY AND STOCHASTIC PROCESSES, 2020, 92 (07) : 1064 - 1099
  • [2] Direct identification of continuous-time linear switched state-space models
    Mejari, Manas
    Piga, Dario
    [J]. IFAC PAPERSONLINE, 2023, 56 (02): : 4210 - 4215
  • [3] Robust FIR Filters for Linear Continuous-Time State-Space Models With Uncertainties
    Quan, Zhonghua
    Han, Soohee
    Park, Jung Hun
    Kwon, Wook Hyun
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2008, 15 : 621 - 624
  • [4] Adaptive identification of continuous-time MIMO state-space models
    Afri, Chouaib
    Bako, Laurent
    Andrieu, Vincent
    Dufour, Pascal
    [J]. 2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 5677 - 5682
  • [6] H∞ FIR filters for linear continuous-time state-space systems
    Ahn, Choon Ki
    Han, Soohee
    Kwon, Wook Hyun
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2006, 13 (09) : 557 - 560
  • [7] An integral architecture for identification of continuous-time state-space LPV models
    Mejari, Manas
    Mavkov, Bojan
    Forgione, Marco
    Piga, Dario
    [J]. IFAC PAPERSONLINE, 2021, 54 (08): : 7 - 12
  • [8] DISCRETE-TIME BILINEAR REPRESENTATION OF CONTINUOUS-TIME BILINEAR STATE-SPACE MODELS
    Phan, Minh Q.
    Shi, Yunde
    Betti, Raimondo
    Longman, Richard W.
    [J]. SPACEFLIGHT MECHANICS 2012, 2012, 143 : 571 - +
  • [9] ON HOMOGENEOUS MARKOV-MODELS WITH CONTINUOUS-TIME AND FINITE OR COUNTABLE STATE-SPACE
    YUSHKEVIC, AA
    FAINBERG, EA
    [J]. THEORY OF PROBABILITY AND ITS APPLICATIONS, 1979, 24 (01) : 156 - 161
  • [10] Maximum approximate likelihood estimation of general continuous-time state-space models
    Mews, Sina
    Langrock, Roland
    Oetting, Marius
    Yaqine, Houda
    Reinecke, Jost
    [J]. STATISTICAL MODELLING, 2024, 24 (01) : 9 - 28