A Segmental DNN/i-vector Approach for Digit-Prompted Speaker Verification

Cited by: 0

Authors:

Yan, Jie [1]

Xie, Lei [1]

Wang, Guangsen [2]

Fu, Zhong-Hua [1]

Affiliations:

[1] Northwestern Polytech Univ, Xian, Shaanxi, Peoples R China

[2] Tencent AI Lab, Shenzhen, Peoples R China

Source:

2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC 2017) | 2017

Funding:

National Natural Science Foundation of China

Keywords:

SPEECH RECOGNITION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

DNN/i-vector systems have achieved state-of-the-art performance in text-independent speaker verification. In such systems, the UBM posteriors are replaced with DNN posteriors when training the i-vector extractor, to better model the phonetic space. However, DNN/i-vector systems have had limited success in text-dependent speaker verification, because the lexical variabilities that are important for such applications are suppressed in utterance-level i-vectors. In this paper, we propose a segmental DNN/i-vector approach for the digit-prompted speaker verification task. Specifically, we segment the utterance into digits and model each digit using an individual DNN/i-vector system. By modeling the variability of each digit independently, we can focus more on the speaker characteristics of each digit. To take into account the uncertainties in the DNN posteriors, we propose a confidence-measure-based weighting method. On the RSR2015 dataset, the proposed approach yields an equal error rate of 3.44%, compared with 5.76% for the baseline utterance-level DNN/i-vector system and 4.54% for the joint factor analysis (JFA) system.
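
The segmental scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring back end or the form of the confidence measure, so everything below is an assumption made for illustration: cosine scoring between per-digit i-vectors, a confidence defined as the mean of the per-frame maximum DNN posterior, and a confidence-weighted average as the fusion rule. The function names (`cosine`, `segment_confidence`, `fuse_digit_scores`) are hypothetical and do not come from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def segment_confidence(frame_posteriors):
    """Assumed confidence measure: mean of the per-frame maximum posterior.

    frame_posteriors is a list of per-frame posterior distributions
    for one digit segment. This definition is illustrative only.
    """
    return sum(max(p) for p in frame_posteriors) / len(frame_posteriors)


def fuse_digit_scores(enroll, test_segments, confidences):
    """Confidence-weighted average of per-digit cosine scores.

    enroll:        dict mapping digit label -> enrollment i-vector
    test_segments: list of (digit label, test i-vector), one per segment
    confidences:   list of non-negative weights, one per test segment
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (digit, ivector), weight in zip(test_segments, confidences):
        weighted_sum += weight * cosine(enroll[digit], ivector)
        weight_total += weight
    return weighted_sum / weight_total


# Toy example with 2-dimensional "i-vectors" (real ones have hundreds of dimensions).
enroll = {"3": [1.0, 0.0], "7": [0.0, 1.0]}
test_segments = [("3", [1.0, 0.0]), ("7", [1.0, 0.0])]
confidences = [0.9, 0.1]
fused = fuse_digit_scores(enroll, test_segments, confidences)
```

In this toy case the segment for digit "3" matches its enrollment vector and carries most of the weight, so the fused score stays high even though the low-confidence segment for digit "7" does not match.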

Citations

[1] Digit-dependent Local I-Vector for Text-Prompted Speaker Verification with Random Digit Sequences
Chen, Peixin
Guo, Wu
Hu, Guoping
[J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016
[2] Investigation of Frame Alignments for GMM-based Digit-prompted Speaker Verification
Liu, Yi
He, Liang
Zhang, Wei-Qiang
Liu, Jia
Johnson, Michael T.
[J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1467 - 1472
[3] I-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification
Tan, Zhili
Mak, Man-Wai
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1562 - 1566
[4] TELEPHONY TEXT-PROMPTED SPEAKER VERIFICATION USING I-VECTOR REPRESENTATION
Zeinali, Hossein
Kalantari, Elaheh
Sameti, Hossein
Hadian, Hossein
[J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4839 - 4843
[5] Double Joint Bayesian Modeling of DNN Local I-Vector for Text Dependent Speaker Verification with Random Digit Strings
Shi, Ziqiang
Lin, Huibin
Liu, Liu
Liu, Rujie
[J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 67 - 71
[6] An I-Vector Backend for Speaker Verification
Kenny, Patrick
Stafylakis, Themos
Alam, Jahangir
Kockmann, Marcel
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2307 - 2311
[7] IMPROVING DNN SPEAKER INDEPENDENCE WITH I-VECTOR INPUTS
Senior, Andrew
Lopez-Moreno, Ignacio
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
[8] DNN i-vector Speaker Verification with Short, Text-constrained Test Utterances
Zhong, Jinghua
Hu, Wenping
Soong, Frank
Meng, Helen
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1507 - 1511
[9] Speaker-Phonetic I-Vector Modeling for Text-Dependent Speaker Verification with Random Digit Strings
Yao, Shengyu
Zhou, Ruohua
Zhang, Pengyuan
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, E102D (02) : 346 - 354
[10] Pairwise Discriminative Speaker Verification in the I-Vector Space
Cumani, Sandro
Bruemmer, Niko
Burget, Lukas
Laface, Pietro
Plchot, Oldrich
Vasilakakis, Vasileios
[J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (06) : 1217 - 1227
