Optimizing Multi-Taper Features for Deep Speaker Verification

Cited by: 0
Authors
Liu, Xuechen [1 ,2 ]
Sahidullah, Md [1 ]
Kinnunen, Tomi [2 ]
Affiliations
[1] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
[2] Univ Eastern Finland, Sch Comp, FI-80101 Joensuu, Finland
Funding
Academy of Finland
Keywords
Feature extraction; Discrete Fourier transforms; Task analysis; Neural networks; Mel frequency cepstral coefficient; Stochastic processes; Standards; Multi-taper spectrum; speaker verification; RECOGNITION; MFCC;
DOI
10.1109/LSP.2021.3122796
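Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code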
0808 ; 0809 ;
Abstract
Multi-taper estimators provide low-variance power spectrum estimates that can be used in place of the windowed discrete Fourier transform (DFT) to extract speech features such as mel-frequency cepstral coefficients (MFCCs). Although past work has reported promising automatic speaker verification (ASV) results with Gaussian mixture model-based classifiers, the performance of multi-taper MFCCs with deep ASV systems remains an open question. Instead of a static-taper design, we propose to optimize the multi-taper estimator jointly with a deep neural network trained for ASV tasks. Our method achieves a maximum improvement of 25.8% in equal error rate over the static-taper design on the SITW corpus and helps preserve a balanced level of leakage and variance, providing more robustness.
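To make the idea of a multi-taper estimate concrete, the following is a minimal sketch of a static multi-taper power spectrum for one speech frame. It assumes sine tapers and uniform weights, which are illustrative choices and not necessarily the configuration used in the paper; the function names `sine_tapers` and `multitaper_power_spectrum` are hypothetical. The jointly optimized estimator described in the abstract is not reproduced here.

```python
import numpy as np


def sine_tapers(frame_len, num_tapers):
    """Return an orthonormal set of sine tapers, shape (num_tapers, frame_len).

    Assumption: sine tapers are used only as a simple, closed-form example
    of a taper family; the record does not state which tapers the paper uses.
    """
    n = np.arange(1, frame_len + 1)
    k = np.arange(1, num_tapers + 1)[:, None]
    scale = np.sqrt(2.0 / (frame_len + 1))
    return scale * np.sin(np.pi * k * n / (frame_len + 1))


def multitaper_power_spectrum(frame, tapers, weights=None):
    """Weighted average of the power spectra obtained with each taper.

    With a single taper this reduces to an ordinary windowed DFT estimate.
    Assumption: uniform weights when none are given.
    """
    num_tapers = tapers.shape[0]
    if weights is None:
        weights = np.full(num_tapers, 1.0 / num_tapers)
    # One tapered copy of the frame per row, then a real FFT along each row.
    spectra = np.abs(np.fft.rfft(tapers * frame, axis=-1)) ** 2
    return weights @ spectra


# Usage: a 400-sample frame (25 ms at 16 kHz) of synthetic noise, 6 tapers.
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
spectrum = multitaper_power_spectrum(frame, sine_tapers(400, 6))
```

Averaging over several orthogonal tapers is what lowers the variance of the estimate relative to a single window. The resulting spectrum could then be passed to a mel filterbank and a discrete cosine transform in the same way as a windowed DFT spectrum when computing MFCCs.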
Pages: 2187-2191
Page count: 5
Related papers
50 in total
  • [1] Multi-Taper Spectral Features for Emotion Recognition from Speech
    Chapaneri, Santosh V.
    Jayaswal, Deepak D.
    [J]. 2015 INTERNATIONAL CONFERENCE ON INDUSTRIAL INSTRUMENTATION AND CONTROL (ICIC), 2015, : 1044 - 1049
  • [2] Speaker Verification with Deep Features
    Liu, Yuan
    Fu, Tianfan
    Fan, Yuchen
    Qian, Yanmin
    Yu, Kai
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 747 - 753
  • [3] Multi-taper implementation of GFDM
    Bandari, Shravan Kumar
    Mani, V. V.
    Drosopoulos, A.
    [J]. 2016 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, 2016,
  • [4] Optimizing Features Extraction Parameters for Speaker Verification
    Impedovo, Donato
    Refice, Mario
    [J]. NEW ASPECTS OF SYSTEMS, PTS I AND II, 2008, : 498 - +
  • [5] Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-taper Spectral Estimation
    Bhat, Chitralekha
    Vachhani, Bhavik
    Kopparapu, Sunil
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 228 - 232
  • [6] A Study of Low-variance Multi-taper Features for Distributed Speech Recognition
    Alam, Md Jahangir
    Kenny, Patrick
    O'Shaughnessy, Douglas
    [J]. ADVANCES IN NONLINEAR SPEECH PROCESSING, 2011, 7015 : 239 - +
  • [7] An Omnidirectional Antenna with Multi-taper Conformal Structure
    Jiang, Zhaoneng
    Sha, Yongxin
    Xuan, Xiaofeng
    Nie, Liying
    [J]. APPLIED COMPUTATIONAL ELECTROMAGNETICS SOCIETY JOURNAL, 2023, 38 (03): : 184 - 192
  • [8] Speaker Verification based on extraction of Deep Features
    Mitsianis, Evangelos
    Spyrou, Evaggelos
    Giannakopoulos, Theodore
    [J]. 10TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE (SETN 2018), 2018,
  • [9] Multi-taper method of analysis of periodicities in hydrologic data
    Rao, AR
    Hamed, K
    [J]. JOURNAL OF HYDROLOGY, 2003, 279 (1-4) : 125 - 143
  • [10] Phonetic-Attention Scoring for Deep Speaker Features in Speaker Verification
    Li, Lantian
    Tang, Zhiyuan
    Shi, Ying
    Wang, Dong
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 284 - 288