LIGHTSPEECH: LIGHTWEIGHT AND FAST TEXT TO SPEECH WITH NEURAL ARCHITECTURE SEARCH

Cited by: 19
Authors
Luo, Renqian [1]
Tan, Xu [2]
Wang, Rui [2]
Qin, Tao [3]
Li, Jinzhu [3]
Zhao, Sheng [3]
Chen, Enhong [1]
Liu, Tie-Yan [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Microsoft Azure Speech, Beijing, Peoples R China
Keywords
Text to speech; lightweight; fast; neural architecture search
DOI
10.1109/ICASSP39728.2021.9414403
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Text to speech (TTS) has been widely used to synthesize natural and intelligible speech in different scenarios. Deploying TTS on end devices such as mobile phones or embedded devices requires extremely low memory usage and inference latency. While non-autoregressive TTS models such as FastSpeech have achieved significantly faster inference than autoregressive models, their model size and inference latency are still too large for deployment on resource-constrained devices. In this paper, we propose LightSpeech, which leverages neural architecture search (NAS) to automatically design more lightweight and efficient models based on FastSpeech. We first profile the components of the current FastSpeech model and carefully design a novel search space containing various lightweight and potentially effective architectures. NAS is then used to automatically discover well-performing architectures within the search space. Experiments show that the model discovered by our method achieves a 15x model compression ratio and a 6.5x inference speedup on CPU with on-par voice quality. Audio demos are provided at https://speechresearch.github.io/lightspeech.
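
The abstract describes a two-step workflow: define a search space of lightweight candidate architectures, then let a search procedure pick a well-performing one. This record does not specify the search space or the search algorithm, so the following is only a minimal sketch of that general idea, using plain random search under a size budget. The operation names, their relative costs, the budget, and the `toy_quality` scoring function are all hypothetical placeholders, not values from the paper.

```python
import random

# Hypothetical search space: each layer picks one operation.
# Names and relative costs are illustrative assumptions only;
# they are NOT the search space used in LightSpeech.
CANDIDATE_OPS = {
    "sep_conv_k5": 1.0,
    "sep_conv_k9": 1.5,
    "sep_conv_k13": 2.0,
    "self_attention": 4.0,
}


def sample_architecture(num_layers, rng):
    """Draw one architecture: a list with one operation name per layer."""
    ops = sorted(CANDIDATE_OPS)
    return [rng.choice(ops) for _ in range(num_layers)]


def model_cost(arch):
    """Sum of the (assumed) per-operation costs, standing in for model size."""
    return sum(CANDIDATE_OPS[op] for op in arch)


def random_search(num_layers, budget, evaluate, num_samples, seed=0):
    """Return the best-scoring sampled architecture whose cost fits the budget."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_samples):
        arch = sample_architecture(num_layers, rng)
        if model_cost(arch) > budget:
            continue  # discard candidates that are too large
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


def toy_quality(arch):
    """Placeholder score; a real system would train and validate the model."""
    return len(set(arch))


if __name__ == "__main__":
    arch, score = random_search(
        num_layers=4, budget=8.0, evaluate=toy_quality, num_samples=200
    )
    print(arch, score, model_cost(arch))
```

In a real setting, `evaluate` would be replaced by a validation metric for a trained TTS model, and the budget would be expressed in parameters or measured latency rather than the made-up unit costs used here.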
Pages: 5699-5703
Number of pages: 5