Learnable MFCCs for Speaker Verification

Cited by: 5
Authors
Liu, Xuechen [1, 2]
Sahidullah, Md [2]
Kinnunen, Tomi [1]
Affiliations
[1] Univ Eastern Finland, Sch Comp, Joensuu, Finland
[2] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
Funding
Academy of Finland
Keywords
Speaker verification; feature extraction; mel-frequency cepstral coefficients (MFCCs); RECOGNITION; FEATURES
DOI
10.1109/ISCAS51556.2021.9401593
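Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract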
We propose a learnable mel-frequency cepstral coefficient (MFCC) front-end architecture for deep neural network (DNN) based automatic speaker verification. Our architecture retains the simplicity and interpretability of MFCC-based features while allowing the model to adapt flexibly to data. In practice, we formulate data-driven versions of the four linear transforms in a standard MFCC extractor: windowing, discrete Fourier transform (DFT), mel filterbank and discrete cosine transform (DCT). The reported results reach up to 6.7% (VoxCeleb1) and 9.7% (SITW) relative improvement in terms of equal error rate (EER) over static MFCCs, without additional tuning effort.
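The four linear transforms named in the abstract can be illustrated with a minimal, standard-library Python sketch of a conventional (static) MFCC extractor for a single frame. This is not the authors' implementation: the Hamming window, the 2595/700 mel formula, the triangular filter shape, the unnormalised DCT-II, and all function and parameter names below are assumptions chosen for illustration. A learnable variant would presumably initialise trainable parameters from fixed operators like these; how the paper parameterises them is not shown here.

```python
import cmath
import math


def hamming(n):
    # Windowing: an element-wise (diagonal) linear transform of the frame.
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_matrix(n, n_bins):
    # DFT: a fixed complex matrix; only the non-negative frequency bins are kept.
    return [[cmath.exp(-2j * math.pi * k * t / n) for t in range(n)]
            for k in range(n_bins)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, n_fft, sample_rate):
    # Mel filterbank: triangular filters with edges equally spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def dct_matrix(n_ceps, n_filters):
    # DCT-II basis (unnormalised) applied to the log filterbank energies.
    return [[math.cos(math.pi * c * (m + 0.5) / n_filters) for m in range(n_filters)]
            for c in range(n_ceps)]


def matvec(matrix, vec):
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def mfcc_frame(frame, sample_rate=16000, n_filters=20, n_ceps=13):
    # Hypothetical defaults; the paper's actual configuration is not given here.
    n = len(frame)
    n_bins = n // 2 + 1
    windowed = [w * x for w, x in zip(hamming(n), frame)]
    spectrum = matvec(dft_matrix(n, n_bins), windowed)
    power = [abs(z) ** 2 for z in spectrum]  # non-linear step between DFT and mel
    energies = matvec(mel_filterbank(n_filters, n_bins, n, sample_rate), power)
    log_energies = [math.log(e + 1e-10) for e in energies]  # floor avoids log(0)
    return matvec(dct_matrix(n_ceps, n_filters), log_energies)
```

Calling `mfcc_frame` on a list of samples (for example, 256 samples of a 16 kHz signal) returns `n_ceps` coefficients. Only the window, DFT, filterbank and DCT are linear; the squared magnitude and the logarithm sit between them as fixed non-linearities.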
Pages: 5