Two-Microphone End-to-End Speaker Joint Identification and Localization Via Convolutional Neural Networks

Cited by: 1

Authors:

Salvati, Daniele [1]

Drioli, Carlo [1]

Foresti, Gian Luca [1]

Affiliations:

[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy

Source:

2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020

Keywords:

Convolutional neural network; end-to-end system; raw waveform; speaker identification; speaker localization; two-microphone array; ACOUSTIC SOURCE LOCALIZATION; NOISY;

DOI:

10.1109/ijcnn48605.2020.9206674

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present an end-to-end scheme based on convolutional neural networks (CNNs) for joint speaker identification and localization. We investigate the possibility of estimating both the direction of arrival (DOA) and the identity of the speaker in far-field noisy and reverberant conditions using a two-channel microphone array. The proposed CNN is designed to map the raw waveform of the two channels to the speaker identity and to the DOA of the speech signal. We analyze the identification and localization performance with simulated experiments in noisy and reverberant conditions.
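
As a rough illustration of the mapping the abstract describes (two raw waveform channels in, speaker identity and DOA out), the following is a minimal NumPy sketch of a shared convolutional front end with two output heads. It is not the authors' architecture: the layer sizes, the single convolutional layer, the treatment of DOA as a classification over discrete angle bins, the random untrained weights, and the names `conv1d`, `init_params`, and `forward` are all assumptions made for illustration.

```python
import numpy as np


def conv1d(x, w, stride):
    """Valid 1-D cross-correlation. x: (C_in, N), w: (C_out, C_in, K)."""
    k = w.shape[2]
    # windows has shape (C_in, n_frames, K) after striding over time
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)[:, ::stride, :]
    return np.einsum("cfk,ock->of", windows, w)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def init_params(rng, n_speakers, n_doa, n_filters=16, kernel=64):
    # Hypothetical sizes; weights are random, i.e. the model is untrained.
    return {
        "conv": rng.standard_normal((n_filters, 2, kernel)) * 0.05,
        "spk": rng.standard_normal((n_speakers, n_filters)) * 0.1,
        "doa": rng.standard_normal((n_doa, n_filters)) * 0.1,
    }


def forward(params, waveform, stride=16):
    """waveform: (2, N) raw samples from the two microphones."""
    # Each filter spans both channels, so it can respond to inter-channel cues.
    feats = np.maximum(conv1d(waveform, params["conv"], stride), 0.0)  # ReLU
    pooled = feats.max(axis=1)  # global max pooling over time
    p_speaker = softmax(params["spk"] @ pooled)  # head 1: speaker identity
    p_doa = softmax(params["doa"] @ pooled)      # head 2: DOA bin (assumed)
    return p_speaker, p_doa


rng = np.random.default_rng(0)
params = init_params(rng, n_speakers=10, n_doa=19)  # assumed class counts
waveform = rng.standard_normal((2, 4000))           # stand-in for real audio
p_speaker, p_doa = forward(params, waveform)
print(p_speaker.shape, p_doa.shape)
```

Sharing the convolutional features between the two heads is one common way to set up joint estimation; whether the paper shares layers in this way is not stated in the abstract.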


Pages: 6

50 records in total

[41] Jasper: An End-to-End Convolutional Neural Acoustic Model
Li, Jason
Lavrukhin, Vitaly
Ginsburg, Boris
Leary, Ryan
Kuchaiev, Oleksii
Cohen, Jonathan M.
Nguyen, Huyen
Gadde, Ravi Teja
INTERSPEECH 2019, 2019, : 71 - 75
[42] Robust End-to-end Speaker Diarization with Generic Neural Clustering
Yang, Chenyu
Wang, Yu
INTERSPEECH 2022, 2022, : 1471 - 1475
[43] End-To-End Phonetic Neural Network Approach for Speaker Verification
Demirbag, Sedat
Erden, Mustafa
Arslan, Levent
2020 28TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2020,
[44] END-TO-END NEURAL SPEAKER DIARIZATION WITH SELF-ATTENTION
Fujita, Yusuke
Kanda, Naoyuki
Horiguchi, Shota
Xue, Yawen
Nagamatsu, Kenji
Watanabe, Shinji
2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 296 - 303
[45] End-to-End Audio-Visual Neural Speaker Diarization
He, Mao-kui
Du, Jun
Lee, Chin-Hui
INTERSPEECH 2022, 2022, : 1461 - 1465
[46] End-to-end recognition of slab identification numbers using a deep convolutional neural network
Lee, Sang Jun
Yun, Jong Pil
Koo, Gyogwon
Kim, Sang Woo
KNOWLEDGE-BASED SYSTEMS, 2017, 132 : 1 - 10
[47] End-to-end video background subtraction with 3d convolutional neural networks
Sakkos, Dimitrios
Liu, Heng
Han, Jungong
Shao, Ling
MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (17) : 23023 - 23041
[48] Streaming Convolutional Neural Networks for End-to-End Learning With Multi-Megapixel Images
Pinckaers, Hans
van Ginneken, Bram
Litjens, Geert
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (03) : 1581 - 1590
[49] END-TO-END PHOTOPLETHYSMOGRAPHY (PPG) BASED BIOMETRIC AUTHENTICATION BY USING CONVOLUTIONAL NEURAL NETWORKS
Luque, Jordi
Cortes, Guillem
Segura, Carlos
Maravilla, Alexandre
Esteban, Javier
Fabregat, Joan
2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 538 - 542
[50] End-to-End Single Image Super-Resolution Based on Convolutional Neural Networks
Ferariu, Lavinia
Beti, Iosif-Alin
2022 26TH INTERNATIONAL CONFERENCE ON SYSTEM THEORY, CONTROL AND COMPUTING (ICSTCC), 2022, : 277 - 282
