Wavelet Scattering Transform and CNN for Closed Set Speaker Identification

Cited by: 14
Authors
Ghezaiel, Wajdi [1]
Brun, Luc [2]
Lezoray, Olivier [2]
Affiliations
[1] Normandie Univ, ENSICAEN, UNICAEN, CNRS, NormaSTIC, F-14000 Caen, France
[2] Normandie Univ, UNICAEN, ENSICAEN, CNRS, Greyc, UMR 6072, F-14000 Caen, France
Keywords
Speaker identification; short utterances; wavelet scattering transform; convolutional neural network; hybrid network; VERIFICATION;
DOI
10.1109/mmsp48831.2020.9287061
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In real-world applications, the performance of speaker identification systems degrades as both the amount and the quality of the available speech decrease. To address this, we propose a speaker identification system that identifies a person from short utterances with few training examples, so that only a very small amount of data, a sentence of 2-4 seconds, is used. To achieve this, we propose a novel raw-waveform, end-to-end convolutional neural network (CNN) for text-independent speaker identification. We use the wavelet scattering transform as a fixed initialization of the first layers of the CNN and learn the remaining layers in a supervised manner. Our experiments show that the hybrid architecture combining the wavelet scattering transform and a CNN performs efficient feature extraction for speaker identification, even with a small number of short-duration training samples.
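The idea of a fixed, non-learned front end can be illustrated with a minimal sketch of first-order scattering coefficients (wavelet filtering, modulus, local averaging) in plain Python. This is not the authors' implementation: the Gabor filter shape, the center frequencies, `sigma`, `half_width`, `pool`, and all function names are assumptions chosen for illustration, and the learned CNN layers that would consume these features are omitted.

```python
import cmath
import math


def gabor_filter(center_freq, sigma, half_width):
    """Complex Gabor (Morlet-like) taps; center_freq is in cycles per sample."""
    taps = []
    for n in range(-half_width, half_width + 1):
        envelope = math.exp(-0.5 * (n / sigma) ** 2)
        taps.append(envelope * cmath.exp(2j * math.pi * center_freq * n))
    norm = sum(abs(t) for t in taps)  # unit gain at the center frequency
    return [t / norm for t in taps]


def modulus_conv(signal, taps):
    """Modulus of the convolution of signal with taps ('valid' positions only)."""
    k = len(taps)
    out = []
    for start in range(len(signal) - k + 1):
        acc = 0j
        for j in range(k):
            acc += signal[start + j] * taps[k - 1 - j]
        out.append(abs(acc))
    return out


def first_order_scattering(signal, center_freqs, sigma=8.0, half_width=24, pool=32):
    """Fixed front end: wavelet modulus followed by non-overlapping averaging.

    Returns one list of pooled frames per center frequency. Nothing here is
    trained; in a hybrid network these features would feed learned layers.
    """
    features = []
    for freq in center_freqs:
        envelope = modulus_conv(signal, gabor_filter(freq, sigma, half_width))
        pooled = [
            sum(envelope[i:i + pool]) / pool
            for i in range(0, len(envelope) - pool + 1, pool)
        ]
        features.append(pooled)
    return features


# Toy input (hypothetical): a pure tone at 0.125 cycles per sample.
tone = [math.sin(2 * math.pi * 0.125 * n) for n in range(512)]
feats = first_order_scattering(tone, center_freqs=[0.0625, 0.125, 0.25])

# Mean energy per channel; the channel tuned to 0.125 should dominate.
energies = [sum(channel) / len(channel) for channel in feats]
```

In a full pipeline, the per-channel frames in `feats` would be stacked into a time-frequency map and passed to trainable convolutional layers; a complete scattering transform would also add second-order coefficients and a proper multi-scale wavelet bank.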
Pages: 6