Iterative Compression of End-to-End ASR Model using AutoML

Cited by: 3
Authors
Mehrotra, Abhinav [1]
Dudziak, Lukasz [1]
Yeo, Jinsu [2]
Lee, Young-yoon [2]
Vipperla, Ravichander [1]
Abdelfattah, Mohamed S. [1]
Bhattacharya, Sourav [1]
Ishtiaq, Samin [1]
Ramos, Alberto Gil C. P. [1]
Lee, SangJeong [2]
Kim, Daehyun [2]
Lane, Nicholas D. [1,3]
Affiliations
[1] Samsung AI Ctr, Cambridge, England
[2] Samsung Res, On Device Lab, Seoul, South Korea
[3] Univ Cambridge, Cambridge, England
Source: INTERSPEECH 2020
Keywords
ASR Compression; AutoML; Reinforcement Learning
DOI
10.21437/Interspeech.2020-1894
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline Codes
100104; 100213
Abstract
Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques. Past research has shown that the AutoML-based Low-Rank Factorization (LRF) technique, when applied to an end-to-end Encoder-Attention-Decoder style ASR model, can achieve a speedup of up to 3.7x, outperforming laborious manual rank-selection approaches. However, we show that current AutoML-based search techniques only work up to a certain compression level, beyond which they fail to produce compressed models with acceptable word error rates (WER). In this work, we propose an iterative AutoML-based LRF approach that achieves over 5x compression without degrading the WER, thereby advancing the state-of-the-art in ASR compression.
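The core operation behind LRF can be sketched as truncated-SVD factorization of a weight matrix: a dense m x n layer is replaced by two thin matrices of rank r, shrinking the parameter count from m*n to r*(m+n). The function names, the rank choice, and the random matrix below are illustrative assumptions for this sketch, not details from the paper (which searches per-layer ranks with reinforcement learning rather than fixing one rank by hand).

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as A @ B with A (m x rank), B (rank x n),
    using the rank-truncated SVD, which is optimal in Frobenius norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # absorb singular values into A
    B = Vt[:rank, :]
    return A, B

def compression_ratio(shape, rank):
    """Parameters in the original matrix vs. the factorized pair."""
    m, n = shape
    return (m * n) / (rank * (m + n))

# Example: a 1024 x 1024 weight matrix factorized at rank 64
W = np.random.randn(1024, 1024)
A, B = low_rank_factorize(W, rank=64)
print(A.shape, B.shape)                           # (1024, 64) (64, 1024)
print(compression_ratio(W.shape, rank=64))        # 8.0
```

Choosing a good rank per layer is the hard part: too low destroys the WER, too high wastes the compression budget, which is why the paper delegates rank selection to an AutoML search and applies it iteratively (compress, fine-tune, compress again) instead of in a single pass.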
Pages: 3361-3365 (5 pages)