Speaker-Phrase-Specific Adaptation of PLDA Model for Improved Performance in Text-Dependent Speaker Verification

被引：0

作者：

Mohammad Azharuddin Laskar

Chuya China Bhanja

Rabul Hussain Laskar

机构：

[1] National Institute of Technology Silchar,Department of Electronics and Communication Engineering

来源：

Circuits, Systems, and Signal Processing | 2021年 / 40卷

关键词：

Text-dependent speaker verification; PLDA adaptation; Online i-vector;

D O I：

暂无

中图分类号：

学科分类号：

摘要：

The i-vector/probabilistic linear discriminant analysis (PLDA) framework has been popularly used in the field of speaker verification for a long time. Lately, the introduction of online i-vectors and its integration with dynamic time warping template matching technique have significantly improved the performance of text-dependent speaker verification system. The PLDA model learns to discriminate among instances of different speaker-phrase classes and also compensates for channel and session variability. However, when exposed to unseen speakers and text, the variability compensation model turns less than optimal, leading to substantial verification error. In this paper, PLDA adaptation, in order to incorporate the idea of speaker-phrase-dependent variability in the ivector/PLDA technique, has been proposed. The adapted model gets specifically tuned to particular speaker-phrase class, leading to a more optimal solution. Two adaptation techniques, namely interpolation and weighted likelihood, have been explored in this work. Experiments have been performed on Part 1 of the RSR2015 database, and relative equal error rate (EER) reductions of up to 58.22% and 45% have been observed for interpolation and weighted likelihood techniques, respectively. The use of speaker-phrase-specific mean and whitening parameters has led to further improvement, resulting in EER reduction of up to 20% relative to that of the adapted models.

引用

页码：5127 / 5151

页数：24

共 50 条

[11] Bidirectional Attention for Text-Dependent Speaker Verification
Fang, Xin
Gao, Tian
Zou, Liang
Ling, Zhenhua
[J]. SENSORS, 2020, 20 (23) : 1 - 17
[12] Robust Methods for Text-Dependent Speaker Verification
Bhukya, Ramesh K.
Prasanna, S. R. Mahadeva
Sarma, Biswajit Dev
[J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2019, 38 (11) : 5253 - 5288
[13] Data Augmentation Enhanced Speaker Enrollment for Text-dependent Speaker Verification
Sarkar, Achintya Kumar
Sarma, Himangshu
Dwivedi, Priyanka
Tan, Zheng-Hua
[J]. 2020 3RD INTERNATIONAL CONFERENCE ON ENERGY, POWER AND ENVIRONMENT: TOWARDS CLEAN ENERGY TECHNOLOGIES (ICEPE 2020), 2021,
[14] IMPOSTURE CLASSIFICATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
Larcher, Anthony
Lee, Kong Aik
Ma, Bin
Li, Haizhou
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[15] Robust Methods for Text-Dependent Speaker Verification
Ramesh K. Bhukya
S. R. Mahadeva Prasanna
Biswajit Dev Sarma
[J]. Circuits, Systems, and Signal Processing, 2019, 38 : 5253 - 5288
[16] Content Normalization for Text-dependent Speaker Verification
Dey, Subhadeep
Madikeri, Srikanth
Motlicek, Petr
Ferras, Marc
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1482 - 1486
[17] Text-dependent speaker recognition using speaker specific compensation
Laxman, S
Sastry, PS
[J]. IEEE TENCON 2003: CONFERENCE ON CONVERGENT TECHNOLOGIES FOR THE ASIA-PACIFIC REGION, VOLS 1-4, 2003, : 384 - 387
[18] Regularized Within-Class Precision Matrix Based PLDA in Text-Dependent Speaker Verification
Yoon, Sung-Hyun
Jeon, Jong-June
Yu, Ha-Jin
[J]. APPLIED SCIENCES-BASEL, 2020, 10 (18):
[19] PHONETICALLY-CONSTRAINED PLDA MODELING FOR TEXT-DEPENDENT SPEAKER VERIFICATION WITH MULTIPLE SHORT UTTERANCES
Larcher, Anthony
Lee, Kong Aik
Ma, Bin
Li, Haizhou
[J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7673 - 7677
[20] CHANNEL ADAPTATION OF PLDA FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
Chen, Liping
Lee, Kong Aik
Ma, Bin
Guo, Wu
Li, Haizhou
Dai, Li Rong
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5251 - 5255

← 1 2 3 4 5 →