Wavelet Scattering Transform and CNN for Closed Set Speaker Identification

Cited by: 14

Authors:

Ghezaiel, Wajdi [1]

Brun, Luc [2]

Lezoray, Olivier [2]

Affiliations:

[1] Normandie Univ, ENSICAEN, UNICAEN, CNRS, NormaSTIC, F-14000 Caen, France

[2] Normandie Univ, UNICAEN, ENSICAEN, CNRS, Greyc, UMR 6072, F-14000 Caen, France

Source:

2020 IEEE 22ND INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2020

Keywords:

Speaker identification; short utterances; wavelet scattering transform; convolutional neural network; hybrid network; VERIFICATION;

DOI:

10.1109/mmsp48831.2020.9287061

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

In real-world applications, the performance of speaker identification systems degrades as both the amount and the quality of the available speech decrease. To address this, we propose a speaker identification system in which short utterances with few training examples are used for person identification, so that only a very small amount of data, a sentence of 2-4 seconds, is used. To achieve this, we propose a novel raw-waveform end-to-end convolutional neural network (CNN) for text-independent speaker identification. We use the wavelet scattering transform as a fixed initialization of the first layers of the CNN and learn the remaining layers in a supervised manner. Experiments show that our hybrid architecture, combining the wavelet scattering transform and a CNN, performs efficient feature extraction for speaker identification, even with a small number of short-duration training samples.
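The idea of a fixed wavelet front end can be illustrated with a minimal NumPy sketch of first-order scattering coefficients (band-pass filtering, modulus, then local averaging). This is not the authors' implementation: the function names `gabor_wavelet_bank` and `first_order_scattering`, the Gabor wavelet shape, the filter count, the quality factor `q`, and the pooling size are all assumptions chosen for illustration, and the trainable CNN layers that would consume the output are not shown.

```python
import numpy as np

def gabor_wavelet_bank(num_filters=8, length=512, sr=16000.0,
                       f_min=200.0, f_max=4000.0, q=4.0):
    """Fixed (non-trainable) complex Gabor wavelets with constant-Q bandwidth.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    t = (np.arange(length) - length // 2) / sr          # time axis in seconds
    centers = np.geomspace(f_min, f_max, num_filters)   # centre frequencies in Hz
    sigma = q / (2.0 * np.pi * centers)                 # envelope narrows as frequency rises
    envelope = np.exp(-0.5 * (t[None, :] / sigma[:, None]) ** 2)
    envelope /= envelope.sum(axis=1, keepdims=True)     # unit gain at each centre frequency
    carrier = np.exp(2j * np.pi * centers[:, None] * t[None, :])
    return envelope * carrier, centers

def first_order_scattering(x, bank, pool=256):
    """Band-pass filter, take the modulus, then average over non-overlapping frames."""
    modulus = np.stack([np.abs(np.convolve(x, psi, mode="same")) for psi in bank])
    n_frames = modulus.shape[1] // pool
    trimmed = modulus[:, : n_frames * pool]              # drop the incomplete last frame
    return trimmed.reshape(len(bank), n_frames, pool).mean(axis=2)
```

For a one-second waveform sampled at 16 kHz, `first_order_scattering(x, bank)` returns an array of shape (filters, frames). In a hybrid setup such as the one described, an array of this kind would play the role of the fixed first-layer output that the learned layers take as input.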


Pages: 6

50 items in total

[21] Closed-set speaker identification using VQ and GMM based models
Bidhan Barai
Tapas Chakraborty
Nibaran Das
Subhadip Basu
Mita Nasipuri
International Journal of Speech Technology, 2022, 25 : 173 - 196
[22] Multi-feature Fusion for Closed Set Text Independent Speaker Identification
Verma, Gyanendra K.
INFORMATION INTELLIGENCE, SYSTEMS, TECHNOLOGY AND MANAGEMENT, 2011, 141 : 170 - 179
[23] Seismic Fault Interpretation Using 3-D Scattering Wavelet Transform CNN
Shen, Shian
Li, Haishan
Chen, Wenchao
Wang, Xiaokai
Huang, Binke
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
[24] Multistage Speaker Feature Tracking Identification System Based on Continuous and Discrete Wavelet Transform
Al-Sawalmeh, Wael
Daqrouq, Khaled
Al-Qawasmi, Abdel-Rahman
MUSP '06: PROCEEDINGS OF THE 9TH WSEAS INTERNATIONAL CONFERENCE ON MULTIMEDIA SYSTEMS AND SIGNAL PROCESSING, 2009, : 30 - +
[25] Fusion of a complementary feature set with MFCC for improved closed set text-independent speaker identification
Chakroborty, Sandipan
Roy, Anindya
Saha, Goutam
2006 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY, VOLS 1-6, 2006, : 2914 - +
[26] ECG Biometric Identification Using Phase Transform and Wavelet Scattering Network
Li, Shixin
Shao, Yong
PROCEEDINGS OF 2023 4TH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE FOR MEDICINE SCIENCE, ISAIMS 2023, 2023, : 209 - 212
[27] Exploiting Wavelet Scattering Transform and 1D-CNN for Unmanned Aerial Vehicle Detection
Ali, Murtiza
Nathwani, Karan
IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 1790 - 1794
[28] Wavelet-based speaker identification
Bovbel, EL
Kheidorov, IE
Chaikou, YA
DSP 2002: 14TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING PROCEEDINGS, VOLS 1 AND 2, 2002, : 1005 - 1008
[29] Adaptive frequency transform for speaker identification
Li, Yan-Ping
Tang, Zhen-Min
Ding, Hui
Zhang, Yan
Nanjing Li Gong Daxue Xuebao/Journal of Nanjing University of Science and Technology, 2010, 34 (02): : 182 - 186
[30] A methodology based on wavelet packet for speaker transform recognition
Mang, Ya-Dong
Sun, Fu-Yuan
2007 INTERNATIONAL CONFERENCE ON WAVELET ANALYSIS AND PATTERN RECOGNITION, VOLS 1-4, PROCEEDINGS, 2007, : 767 - 771
