Exploiting Low-Rank Tensor-Train Deep Neural Networks Based on Riemannian Gradient Descent With Illustrations of Speech Processing

Cited by: 8
Authors
Qi, Jun [1, 2]
Yang, Chao-Han Huck [2]
Chen, Pin-Yu [3]
Tejedor, Javier [4]
Affiliations
[1] Fudan Univ, Sch Informat Sci & Engn, Dept Elect Engn, Shanghai 200438, Peoples R China
[2] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
[3] IBM Res, Yorktown Heights, NY 10598 USA
[4] CEU Univ, Univ San Pablo CEU, Inst Technol, Boadilla Del Monte 28668, Spain
Keywords
Tensor-train network; speech enhancement; spoken command recognition; Riemannian gradient descent; low-rank tensor-train decomposition; tensor-train deep neural network; MEAN ABSOLUTE ERROR; ALGORITHMS; RMSE; MAE;
DOI
10.1109/TASLP.2022.3231714
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
This work focuses on designing low-complexity hybrid tensor networks by considering trade-offs between the model complexity and practical performance. Firstly, we exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN. Secondly, a hybrid model combining LR-TT-DNN with a convolutional neural network (CNN), which is denoted as CNN+(LR-TT-DNN), is set up to boost the performance. Instead of randomly assigning large TT-ranks for TT-DNN, we leverage Riemannian gradient descent to determine a TT-DNN associated with small TT-ranks. Furthermore, CNN+(LR-TT-DNN) consists of convolutional layers at the bottom for feature extraction and several TT layers at the top to solve regression and classification problems. We separately assess the LR-TT-DNN and CNN+(LR-TT-DNN) models on speech enhancement and spoken command recognition tasks. Our empirical evidence demonstrates that the LR-TT-DNN and CNN+(LR-TT-DNN) models with fewer model parameters can outperform the TT-DNN and CNN+(TT-DNN) counterparts.
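The parameter savings that small TT-ranks bring can be illustrated with a short sketch. Note that this is not the Riemannian gradient descent procedure described in the abstract: it substitutes the standard TT-SVD (sequential truncated SVDs) as a stand-in, and the function names `tt_svd` and `tt_to_full`, the 64 x 64 layer size, the mode sizes, and the rank cap of 2 are all assumptions chosen for illustration only.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Split a d-way tensor into TT cores using sequential truncated SVDs.

    Illustrative stand-in only; the paper determines small TT-ranks with
    Riemannian gradient descent, which is not implemented here.
    """
    dims = tensor.shape
    cores = []
    rank_prev = 1
    mat = tensor.reshape(rank_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))  # cap the TT-rank
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remaining factor (singular values times V^T) forward.
        mat = (s[:rank, None] * vt[:rank]).reshape(rank * dims[k + 1], -1)
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract TT cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))


# Hypothetical setup: a 64 x 64 dense weight viewed as a 3-way tensor with
# modes of size 16 (each mode pairing a 4-dim input and a 4-dim output factor).
rng = np.random.default_rng(0)
true_cores = [
    rng.standard_normal((1, 16, 2)),
    rng.standard_normal((2, 16, 2)),
    rng.standard_normal((2, 16, 1)),
]
weight = tt_to_full(true_cores)  # shape (16, 16, 16), exact TT-rank 2

cores = tt_svd(weight, max_rank=2)
rel_err = np.linalg.norm(tt_to_full(cores) - weight) / np.linalg.norm(weight)

dense_params = weight.size
tt_params = sum(core.size for core in cores)
print("core shapes:", [core.shape for core in cores])
print("dense parameters:", dense_params, "TT parameters:", tt_params)
print("relative reconstruction error:", rel_err)
```

Because the example tensor is built to have TT-rank 2, the rank-2 cores reproduce it up to floating-point error while storing far fewer values than the dense weight. For real trained weights the truncation would generally be lossy, which is the trade-off between model complexity and performance that the abstract refers to.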
Pages: 633-642
Page count: 10