Learnable MFCCs for Speaker Verification

Cited by: 5

Authors:

Liu, Xuechen [1,2]

Sahidullah, Md [2]

Kinnunen, Tomi [1]

Affiliations:

[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland

[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France

Source:

2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2021

Funding:

Academy of Finland;

Keywords:

Speaker verification; feature extraction; mel-frequency cepstral coefficients (MFCCs); RECOGNITION; FEATURES;

DOI:

10.1109/ISCAS51556.2021.9401593

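Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract: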

We propose a learnable mel-frequency cepstral coefficient (MFCC) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to be adapted to data flexibly. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank, and discrete cosine transform (DCT). The reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
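For orientation, the four linear transforms named in the abstract can be written as explicit arrays in a static (non-learnable) MFCC extractor. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the frame length, sample rate, filter count, number of coefficients, Hamming window, and all function names are illustrative choices that do not come from the record. In a learnable variant, arrays such as `window`, `dft`, `fbank`, and `dct` would presumably become trainable parameters initialized from these static values.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular mel filters, shape (n_filters, n_fft // 2 + 1)."""
    n_bins = n_fft // 2 + 1
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    fb = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def dct_matrix(n_ceps, n_filters):
    """Unnormalized DCT-II basis, shape (n_ceps, n_filters)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_filters)

def mfcc_frame(frame, window, dft, fbank, dct, eps=1e-10):
    """MFCCs of one frame as a chain of four linear transforms."""
    windowed = window * frame              # 1) windowing (element-wise product)
    spectrum = dft @ windowed              # 2) DFT as a matrix product
    power = np.abs(spectrum) ** 2          # nonlinearity: power spectrum
    log_mel = np.log(fbank @ power + eps)  # 3) mel filterbank, then log
    return dct @ log_mel                   # 4) DCT

# Illustrative settings (assumptions, not taken from the paper).
n_fft, sample_rate, n_filters, n_ceps = 512, 16000, 40, 20

window = np.hamming(n_fft)
k = np.arange(n_fft // 2 + 1)[:, None]
n = np.arange(n_fft)[None, :]
dft = np.exp(-2j * np.pi * k * n / n_fft)  # one-sided DFT matrix
fbank = mel_filterbank(n_filters, n_fft, sample_rate)
dct = dct_matrix(n_ceps, n_filters)

frame = np.random.default_rng(0).standard_normal(n_fft)  # stand-in for a speech frame
coeffs = mfcc_frame(frame, window, dft, fbank, dct)
```

Writing each stage as an explicit array makes it clear which quantities a data-driven front end could adapt, while the two nonlinearities (power spectrum and logarithm) stay fixed.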

Pages: 5
