An experimental comparison of modelling techniques for speaker recognition under limited data condition

Cited by: 13

Authors:

Jayanna, H. S. [1]

Prasanna, S. R. Mahadeva [1]

Affiliations:

[1] Indian Inst Technol Guwahati, Dept Elect & Commun Engn, Gauhati 781039, Assam, India

Source:

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2009, Vol. 34, No. 5

Keywords:

Speaker recognition; limited data; CVQ; FVQ; SOM; LVQ; GMM; GMM-UBM; IDENTIFICATION; SPEECH;

DOI:

10.1007/s12046-009-0042-9

Chinese Library Classification number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Most existing modelling techniques for speaker recognition implicitly assume that sufficient data is available for speaker modelling, and hence may model speakers poorly under limited data conditions. The present work gives an experimental evaluation of modelling techniques such as Crisp Vector Quantization (CVQ), Fuzzy Vector Quantization (FVQ), Self-Organizing Map (SOM), Learning Vector Quantization (LVQ), and Gaussian Mixture Model (GMM) classifiers. The widely used Gaussian Mixture Model-Universal Background Model (GMM-UBM) is also evaluated experimentally. The experimental findings are then used to select a subset of classifiers from which combined classifiers are built. It is proposed that the combined LVQ and GMM-UBM classifier provides relatively better performance than all the individual classifiers and the other combined classifiers.


Pages: 717-728

Number of pages: 12

50 items in total

[11] Comparison of whole word and subword modeling techniques for speaker verification with limited training data
Euler, S
Langlitz, R
Zinke, J
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1079 - 1082
[12] SPEAKER RECOGNITION IN NOISY CONDITIONS WITH LIMITED TRAINING DATA
McLaughlin, Niall
Ming, Ji
Crookes, Danny
[J]. 19TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO-2011), 2011, : 1294 - 1298
[13] Feature extraction and modelling techniques for multilingual speaker recognition: a review
Nagaraja, B. G.
Jayanna, H. S.
[J]. INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2016, 9 (02) : 67 - 78
[14] VAD, feature extraction and modelling techniques for speaker recognition: a review
Jainar, Spoorti J.
Sale, Pritam Limbaji
Nagaraja, B. G.
[J]. INTERNATIONAL JOURNAL OF SIGNAL AND IMAGING SYSTEMS ENGINEERING, 2020, 12 (1-2) : 1 - 18
[15] Limited Labels for Unlimited Data: Active Learning for Speaker Recognition
Shum, Stephen H.
Dehak, Najim
Glass, James R.
[J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 383 - 387
[16] System Source and Dynamic Features for Speaker Verification for Limited Data Condition
Kumari, T. R. Jayanthi
Jayanna, H. S.
[J]. PROCEEDINGS OF THE 2016 IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET), 2016, : 1458 - 1461
[17] Text Dependent Speaker Verification using Algebraic Approach (AA) method and DTW under limited data condition
Paul, Suman
Misra, Songhita
Das, Tushar Kanti
Laskar, Rabul Hussain
[J]. 2015 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2015,
[18] Significance of Glottal Activity Detection for Speaker Verification in Degraded and Limited Data Condition
Pandey, Ashutosh
Das, Rohan Kumar
Adiga, Nagaraj
Gupta, Naresh
Prasanna, S. R. Mahadeva
[J]. TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE, 2015,
[19] A Comparison of Speaker Clustering and Speech Recognition Techniques for Air Situational Awareness
Shen, Wade
Reynolds, Douglas
[J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2412 - 2415
[20] DATA SAMPLING ENSEMBLE ACOUSTIC MODELLING IN SPEAKER INDEPENDENT SPEECH RECOGNITION
Chen, Xin
Zhao, Yunxin
[J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5130 - 5133
