Wavelet Scattering Transform and CNN for Closed Set Speaker Identification

Cited by: 14

Authors:

Ghezaiel, Wajdi [1]

Brun, Luc [2]

Lezoray, Olivier [2]

Affiliations:

[1] Normandie Univ, ENSICAEN, UNICAEN, CNRS, NormaSTIC, F-14000 Caen, France

[2] Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, UMR 6072, F-14000 Caen, France

Source:

2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2020

Keywords:

Speaker identification; short utterances; wavelet scattering transform; convolutional neural network; hybrid network; VERIFICATION;

DOI:

10.1109/mmsp48831.2020.9287061

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

In real-world applications, the performance of speaker identification systems degrades as both the amount and the quality of the available speech decrease. To address this, we propose a speaker identification system that identifies a person from short utterances with few training examples, using only a very small amount of data: a sentence of 2-4 seconds. Specifically, we propose a novel raw-waveform, end-to-end convolutional neural network (CNN) for text-independent speaker identification. We use the wavelet scattering transform as a fixed initialization of the first layers of the CNN and learn the remaining layers in a supervised manner. The conducted experiments show that our hybrid architecture, combining the wavelet scattering transform and a CNN, can perform efficient feature extraction for speaker identification, even with a small number of short-duration training samples.
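
The idea of a fixed wavelet front end followed by learned layers can be illustrated with a minimal sketch. This is not the authors' implementation: the Gabor filters, the dyadic centre frequencies, the window averaging that stands in for the low-pass filter, and every function name below are assumptions chosen for illustration. Only the fixed (non-learned) first-order stage is shown; the trainable CNN layers that would consume its output are omitted.

```python
import cmath
import math


def gabor_filter(center_freq, sigma, half_width):
    """Complex Gabor (Morlet-like) wavelet sampled on [-half_width, half_width]."""
    taps = []
    for t in range(-half_width, half_width + 1):
        envelope = math.exp(-0.5 * (t / sigma) ** 2)
        taps.append(envelope * cmath.exp(1j * center_freq * t))
    norm = sum(abs(c) for c in taps)  # L1 normalisation: unit gain at center_freq
    return [c / norm for c in taps]


def conv_same(signal, taps):
    """Zero-padded 'same' convolution of a real signal with complex taps."""
    half = len(taps) // 2
    out = []
    for n in range(len(signal)):
        acc = 0j
        for k, c in enumerate(taps):
            idx = n - (k - half)
            if 0 <= idx < len(signal):
                acc += c * signal[idx]
        out.append(acc)
    return out


def first_order_scattering(signal, num_scales=4, pool=16):
    """Fixed front end: modulus of wavelet responses, averaged over `pool` samples."""
    features = []
    for j in range(num_scales):
        center_freq = math.pi / 2 ** (j + 1)  # dyadic centre frequencies (assumption)
        sigma = 2.0 ** (j + 1)                # wider wavelets at coarser scales
        taps = gabor_filter(center_freq, sigma, half_width=int(3 * sigma))
        modulus = [abs(v) for v in conv_same(signal, taps)]
        # Window averaging stands in for the low-pass filter, then subsamples.
        pooled = [sum(modulus[i:i + pool]) / pool
                  for i in range(0, len(modulus) - pool + 1, pool)]
        features.append(pooled)
    return features  # num_scales channels, each of length len(signal) // pool


# Toy input: a pure tone at pi/4 rad/sample, matching the scale j = 1 wavelet.
tone = [math.sin(math.pi / 4 * n) for n in range(256)]
channels = first_order_scattering(tone)
```

In a hybrid network of the kind the abstract describes, a feature map like `channels` would be passed to convolutional layers whose weights are trained, while the wavelet filters stay fixed.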


Pages: 6

50 items in total

[31] STUDY OF STATISTICAL ROBUST CLOSED SET SPEAKER IDENTIFICATION WITH FEATURE AND SCORE-BASED FUSION
Al-Kaltakchi, Musab T. S.
Woo, Wai L.
Dlay, Satnam S.
Chambers, Jonathon A.
2016 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2016,
[32] Fault detection and identification method: 3D-CNN combined with continuous wavelet transform
Ukawa, Chinatsu
Yamashita, Yoshiyuki
COMPUTERS & CHEMICAL ENGINEERING, 2024, 189
[33] Heart Murmur and Abnormal PCG Detection via Wavelet Scattering Transform and 1D-CNN
Patwa, Ahmed
Rahman, Muhammad Mahboob Ur
Al-Naffouri, Tareq Y.
IEEE SENSORS JOURNAL, 2025, 25 (07) : 12430 - 12443
[34] Wavelet Transform Based Multistage Speaker Feature Tracking Identification System Using Linear Prediction Coefficient
Daqrouq, Khaled
Al-Qawasmi, Abdel-Rahman
Al-Sawalmeh, Wael
Abu Hilal, Tareq
2009 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTATIONAL TOOLS FOR ENGINEERING APPLICATIONS, 2009, : 173 - +
[35] Text-independent speaker identification system using discrete wavelet transform with linear prediction coding
Alrusaini, Othman
Daqrouq, Khaled
Journal of Umm Al-Qura University for Engineering and Architecture, 2024, 15 (2) : 112 - 119
[36] Detection of Korotkoff Sounds Using Wavelet Transform and CNN
Lee, Bomi
Min, Changhee
Jeong, Jae-Hak
Hong, Junki
Park, Yong-Hwa
TRANSACTIONS OF THE KOREAN SOCIETY OF MECHANICAL ENGINEERS B, 2023, 47 (08) : 423 - 427
[37] Average framing linear prediction coding with wavelet transform for text-independent speaker identification system
Daqrouq, Khaled
Al Azzawi, Khalooq Y.
COMPUTERS & ELECTRICAL ENGINEERING, 2012, 38 (06) : 1467 - 1479
[38] WAVELET TRANSFORM IN SCATTERING DATA INTERPOLATION
YAOU, MH
CHANG, WT
ELECTRONICS LETTERS, 1993, 29 (21) : 1835 - 1837
[39] Biometric Speaker Recognition Using Neural Networks and Wavelet Transform
Daghbosheh, Mohammed
Hattab, Ezz
Bisher, Ahmad
2011 INTERNATIONAL CONFERENCE ON CIVIL ENGINEERING AND INFORMATION TECHNOLOGY (CEIT 2011), 2011, : 1 - 8
[40] Improved Closed Set Text Independent Speaker Identification System using Gammachirp Filterbank in Noisy Environments
Ben Abdallah, Amina
Hajaiej, Zied
2014 11TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2014,
