LIGHTSPEECH: LIGHTWEIGHT AND FAST TEXT TO SPEECH WITH NEURAL ARCHITECTURE SEARCH

Cited by: 19

Authors:

Luo, Renqian [1]

Tan, Xu [2]

Wang, Rui [2]

Qin, Tao [3]

Li, Jinzhu [3]

Zhao, Sheng [3]

Chen, Enhong [1]

Liu, Tie-Yan [2]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Peoples R China

[2] Microsoft Res Asia, Beijing, Peoples R China

[3] Microsoft Azure Speech, Beijing, Peoples R China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

Text to speech; lightweight; fast; neural architecture search

DOI:

10.1109/ICASSP39728.2021.9414403

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Text to speech (TTS) has been broadly used to synthesize natural and intelligible speech in different scenarios. Deploying TTS on end devices such as mobile phones or embedded devices requires extremely small memory usage and low inference latency. While non-autoregressive TTS models such as FastSpeech achieve significantly faster inference than autoregressive models, their model size and inference latency are still too large for deployment on resource-constrained devices. In this paper, we propose LightSpeech, which leverages neural architecture search (NAS) to automatically design more lightweight and efficient models based on FastSpeech. We first profile the components of the current FastSpeech model and carefully design a novel search space containing various lightweight and potentially effective architectures. NAS is then used to automatically discover well-performing architectures within the search space. Experiments show that the model discovered by our method achieves a 15x model compression ratio and a 6.5x inference speedup on CPU with on-par voice quality. Audio demos are provided at https://speechresearch.github.io/lightspeech.


Pages: 5699-5703

Page count: 5
