Wavelet Scattering Transform and CNN for Closed Set Speaker Identification

Cited by: 14
Authors
Ghezaiel, Wajdi [1]
Brun, Luc [2]
Lezoray, Olivier [2]
Affiliations
[1] Normandie Univ, ENSICAEN, UNICAEN, CNRS, NormaSTIC, F-14000 Caen, France
[2] Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, UMR 6072, F-14000 Caen, France
Keywords
Speaker identification; short utterances; wavelet scattering transform; convolutional neural network; hybrid network; VERIFICATION;
DOI
10.1109/mmsp48831.2020.9287061
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
In real-world applications, the performance of speaker identification systems degrades as both the amount and the quality of the available speech decrease. To address this, we propose a speaker identification system that identifies a person from short utterances with few training examples, so that only a very small amount of data, a sentence of 2-4 seconds, is used. To achieve this, we propose a novel raw-waveform end-to-end convolutional neural network (CNN) for text-independent speaker identification. We use the wavelet scattering transform as a fixed initialization of the first layers of the CNN and learn the remaining layers in a supervised manner. The conducted experiments show that our hybrid architecture, combining the wavelet scattering transform and a CNN, can successfully perform efficient feature extraction for speaker identification, even with a small number of short-duration training samples.
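The idea of a fixed, non-learned scattering front end can be illustrated with a minimal sketch. It is not the authors' implementation: it computes only first-order coefficients, uses truncated Gabor filters with an assumed bandwidth rule, and replaces the usual low-pass filter with box averaging. The names `gabor_filter`, `convolve_same`, and `first_order_scattering`, and all parameter values, are hypothetical choices made for illustration.

```python
import cmath
import math


def gabor_filter(center_freq, length=65):
    """Complex band-pass filter centred at center_freq (cycles/sample)."""
    half = length // 2
    sigma = 1.0 / center_freq  # assumption: width inversely proportional to frequency
    taps = [
        cmath.exp(2j * math.pi * center_freq * t)
        * math.exp(-t * t / (2 * sigma * sigma))
        for t in range(-half, half + 1)
    ]
    norm = sum(abs(w) for w in taps)  # unit L1 norm keeps band gains comparable
    return [w / norm for w in taps]


def convolve_same(signal, kernel):
    """Direct zero-padded convolution returning len(signal) samples."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0j
        for k, w in enumerate(kernel):
            j = i + half - k
            if 0 <= j < n:
                acc += signal[j] * w
        out.append(acc)
    return out


def first_order_scattering(signal, center_freqs, pool=64):
    """Fixed (non-trainable) features: |signal * filter| averaged over time."""
    features = []
    for freq in center_freqs:
        filtered = convolve_same(signal, gabor_filter(freq))
        envelope = [abs(v) for v in filtered]
        # Box averaging stands in for the scattering low-pass filter.
        pooled = [
            sum(envelope[i:i + pool]) / pool
            for i in range(0, len(envelope) - pool + 1, pool)
        ]
        features.append(pooled)
    return features


# A pure tone at 0.1 cycles/sample should mostly excite the 0.1 band.
tone = [math.sin(2 * math.pi * 0.1 * n) for n in range(512)]
bands = [0.05, 0.1, 0.2, 0.4]
feats = first_order_scattering(tone, bands)
```

In a hybrid network of the kind the abstract describes, an array like `feats` (bands by time frames) would play the role of the fixed first layers' output, and only the convolutional and classification layers stacked on top of it would be trained.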
Pages: 6