Two-Microphone End-to-End Speaker Joint Identification and Localization Via Convolutional Neural Networks

Cited by: 1

Authors:

Salvati, Daniele [1]

Drioli, Carlo [1]

Foresti, Gian Luca [1]

Affiliations:

[1] Univ Udine, Dept Math Comp Sci & Phys, Udine, Italy

Source:

2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020

Keywords:

Convolutional neural network; end-to-end system; raw waveform; speaker identification; speaker localization; two-microphone array; ACOUSTIC SOURCE LOCALIZATION; NOISY;

DOI:

10.1109/ijcnn48605.2020.9206674

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present an end-to-end scheme based on convolutional neural networks (CNNs) for joint speaker identification and localization. We investigate the possibility of estimating both the direction of arrival (DOA) and the identity of the speaker in far-field noisy and reverberant conditions using a two-channel microphone array. The proposed CNN is designed to map the raw waveforms of the two channels to the speaker identity and to the DOA of the speech signal. We analyze the identification and localization performance with simulated experiments in noisy and reverberant conditions.
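The mapping described in the abstract (two raw-waveform channels in, speaker identity and DOA out) can be illustrated with a minimal forward-pass sketch. This is not the authors' architecture: the single convolutional layer, the filter count, the kernel size, the stride, the random untrained weights, and the treatment of DOA as classification over a discretized set of angles are all assumptions made for illustration, and every name below (`TwoHeadCNN`, `conv1d_relu`, and so on) is hypothetical.

```python
import math
import random


def conv1d_relu(x, kernels, bias, stride):
    """1-D convolution over a multi-channel signal, followed by ReLU.

    x: list of C channels, each a list of T samples.
    kernels: list of F filters, each shaped [C][K].
    Returns F feature maps (lists of floats).
    """
    n_ch, length = len(x), len(x[0])
    k = len(kernels[0][0])
    feature_maps = []
    for f, kern in enumerate(kernels):
        fmap = []
        for start in range(0, length - k + 1, stride):
            acc = bias[f]
            for c in range(n_ch):
                for j in range(k):
                    acc += kern[c][j] * x[c][start + j]
            fmap.append(max(0.0, acc))  # ReLU
        feature_maps.append(fmap)
    return feature_maps


def global_avg_pool(feature_maps):
    """Collapse each feature map to its mean value."""
    return [sum(m) / len(m) for m in feature_maps]


def linear(v, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * a for w, a in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TwoHeadCNN:
    """Hypothetical shared feature extractor with two output heads.

    Assumption: DOA is predicted as one of n_doa_classes discrete angles.
    Weights are random (untrained); only the data flow is illustrated.
    """

    def __init__(self, n_speakers, n_doa_classes, n_filters=8, kernel=16, seed=0):
        rng = random.Random(seed)
        # Each filter spans both microphone channels, so it can respond
        # to inter-channel differences as well as to spectral content.
        self.kernels = [rand_matrix(2, kernel, rng) for _ in range(n_filters)]
        self.conv_bias = [0.0] * n_filters
        self.w_spk = rand_matrix(n_speakers, n_filters, rng)
        self.b_spk = [0.0] * n_speakers
        self.w_doa = rand_matrix(n_doa_classes, n_filters, rng)
        self.b_doa = [0.0] * n_doa_classes

    def forward(self, stereo):
        feats = global_avg_pool(
            conv1d_relu(stereo, self.kernels, self.conv_bias, stride=4))
        p_speaker = softmax(linear(feats, self.w_spk, self.b_spk))
        p_doa = softmax(linear(feats, self.w_doa, self.b_doa))
        return p_speaker, p_doa


# Usage: a synthetic two-channel frame of 256 samples (random noise, not speech).
rng = random.Random(1)
stereo = [[rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(2)]
model = TwoHeadCNN(n_speakers=5, n_doa_classes=7)
p_speaker, p_doa = model.forward(stereo)
```

Here `p_speaker` and `p_doa` are probability vectors over the hypothetical 5 speakers and 7 angle classes. Sharing one feature extractor between the two heads is one simple way to realize a joint estimate from the same raw input; the record does not state whether the paper does this.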


Pages: 6

50 records in total

[21] Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Parcollet, Titouan
Zhang, Ying
Morchid, Mohamed
Trabelsi, Chiheb
Linares, Georges
De Mori, Renato
Bengio, Yoshua
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 22 - 26
[22] End-to-End Blood Pressure Prediction via Fully Convolutional Networks
Baek, Sanghyun
Jang, Jiyong
Yoon, Sungroh
IEEE ACCESS, 2019, 7 : 185458 - 185468
[23] An End-to-End Real-Time Face Identification and Attendance System using Convolutional Neural Networks
Rai, Aashish
Karnani, Rashmi
Chudasama, Vishal
Upla, Kishor
2019 IEEE 16TH INDIA COUNCIL INTERNATIONAL CONFERENCE (IEEE INDICON 2019), 2019,
[24] TOWARDS END-TO-END SPEAKER DIARIZATION WITH GENERALIZED NEURAL SPEAKER CLUSTERING
Zhang, Chunlei
Shi, Jiatong
Weng, Chao
Yu, Meng
Yu, Dong
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8372 - 8376
[25] Two-Stage Transfer Learning of End-to-End Convolutional Neural Networks for Webpage Saliency Prediction
Shan, Wei
Sun, Guangling
Zhou, Xiaofei
Liu, Zhi
INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING, ISCIDE 2017, 2017, 10559 : 316 - 324
[26] Neural PLDA Modeling for End-to-End Speaker Verification
Ramoji, Shreyas
Krishnan, Prashant
Ganapathy, Sriram
INTERSPEECH 2020, 2020, : 4333 - 4337
[27] Image Shadow Removal Using End-To-End Deep Convolutional Neural Networks
Fan, Hui
Han, Meng
Li, Jinjiang
APPLIED SCIENCES-BASEL, 2019, 9 (05):
[28] End-to-end Convolutional Neural Networks for Sound Event Detection in Urban Environments
Zinemanas, Pablo
Cancela, Pablo
Rocamora, Martin
PROCEEDINGS OF THE 24TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT), 2019, : 533 - 539
[29] CONVOLUTIONAL ANALYSIS OPERATOR LEARNING BY END-TO-END TRAINING OF ITERATIVE NEURAL NETWORKS
Kofler, Andreas
Wald, Christian
Schaeffter, Tobias
Haltmeier, Markus
Kolbitsch, Christoph
2022 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (IEEE ISBI 2022), 2022,
[30] Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks
Zhang, Wei
Zhai, Minghao
Huang, Zilong
Liu, Chen
Li, Wei
Cao, Yi
INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PART VI, 2019, 11745 : 332 - 341
