SINGULAR VALUE DECOMPOSITION BASED LOW-FOOTPRINT SPEAKER ADAPTATION AND PERSONALIZATION FOR DEEP NEURAL NETWORK

Cited by: 0

Authors:

Xue, Jian [1]

Li, Jinyu [1]

Yu, Dong [1]

Seltzer, Mike [1]

Gong, Yifan [1]

Affiliations:

[1] Microsoft Corp, Redmond, WA 98052 USA

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014

Keywords:

deep neural network; speaker adaptation; speaker personalization; singular value decomposition;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206; 082403;

Abstract:

The large number of parameters in deep neural networks (DNNs) for automatic speech recognition (ASR) makes speaker adaptation very challenging. It also limits the use of speaker personalization because of the huge storage cost in large-scale deployments. In this paper we address DNN adaptation and personalization by presenting two methods based on the singular value decomposition (SVD). The first method uses an SVD to replace the weight matrix of a speaker-independent DNN with the product of two low-rank matrices. Adaptation is then performed by updating a square matrix inserted between the two low-rank matrices. In the second method, we adapt the full weight matrix but store only the delta matrix, that is, the difference between the original and adapted weight matrices. We decrease the footprint of the adapted model by storing a reduced-rank version of the delta matrix obtained via an SVD. The proposed methods were evaluated on a short message dictation task. Experimental results show that we can obtain accuracy improvements similar to those of the previously proposed Kullback-Leibler divergence (KLD) regularized method while using far fewer parameters, requiring only 0.89% of the original model storage.
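
The two methods described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `svd_bottleneck` and `compress_delta`, the toy matrix sizes, the choice to initialize the inserted square matrix to the identity, and the choice to fold the singular values into the left factor are all assumptions made for illustration. The adapted weights in the second part are simulated rather than trained.

```python
import numpy as np


def svd_bottleneck(W, k):
    """Method 1 sketch: factor W (m x n) as U_k @ S @ V_k with a k x k matrix S.

    Assumption: S starts as the identity, so the unadapted layer equals the
    rank-k SVD approximation of W. Only S would be updated per speaker.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_k = U[:, :k] * s[:k]   # scale each kept column by its singular value
    V_k = Vt[:k, :]
    S = np.eye(k)            # hypothetical speaker-dependent square matrix
    return U_k, S, V_k


def compress_delta(W_si, W_adapted, r):
    """Method 2 sketch: rank-r factors of the delta matrix W_adapted - W_si."""
    delta = W_adapted - W_si
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    A = U[:, :r] * s[:r]     # m x r
    B = Vt[:r, :]            # r x n
    return A, B


rng = np.random.default_rng(0)
m, n, k, r = 64, 48, 16, 4   # toy sizes, not taken from the paper
W = rng.standard_normal((m, n))

# Method 1: the layer becomes U_k @ S @ V_k; per-speaker storage is k * k.
U_k, S, V_k = svd_bottleneck(W, k)
W_lowrank = U_k @ S @ V_k

# Method 2: simulate an adapted matrix whose delta happens to have rank r,
# then store only the two thin factors (r * (m + n) values) per speaker.
W_adapted = W + 0.01 * rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A, B = compress_delta(W, W_adapted, r)
W_restored = W + A @ B

full_params = m * n
method1_params = k * k
method2_params = r * (m + n)
print(full_params, method1_params, method2_params)
```

In both cases the per-speaker footprint depends on the chosen rank (`k` or `r`) rather than on the full layer size, which is the trade-off the abstract describes; the ranks used here are arbitrary and would need to be tuned against recognition accuracy.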

Pages: 5

