Real-Time Convolutional Neural Network-Based Speech Source Localization on Smartphone

Authors
Kucuk, Abdullah [1]
Ganguly, Anshuman [1]
Hao, Yiya [1]
Panahi, Issa M. S. [1]
Affiliations
[1] Univ Texas Dallas, Dept Elect & Comp Engn, Richardson, TX 75080 USA
Funding
National Institutes of Health (NIH), USA
Keywords
Direction-of-arrival estimation; Microphones; Real-time systems; Estimation; Noise measurement; Convolution; Speech processing; Convolutional neural network; speech source localization (SSL); smartphone; direction of arrival (DOA); SOUND LOCALIZATION; ARRIVAL ESTIMATION; HEARING; ENHANCEMENT; DIFFERENCE; MODEL;
DOI
10.1109/ACCESS.2019.2955049
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we present a real-time convolutional neural network (CNN)-based approach for speech source localization (SSL) under noisy conditions, using an Android-based smartphone and its two built-in microphones. We propose a new input feature set for CNN-based SSL that uses the real and imaginary parts of the short-time Fourier transform (STFT). To train our CNN model, we use simulated noisy data from popular datasets, augmented with a few hours of real recordings collected on smartphones. We compare the proposed method to recent CNN-based SSL methods trained on our dataset and show that our method offers higher accuracy on identical test datasets. Another unique aspect of this work is that we perform real-time inference with our CNN model on an Android smartphone with low latency (14 ms for single-frame estimation and 180 ms for multi-frame estimation, with a frame length of 20 ms in both cases) and high accuracy (88.83% at 0 dB SNR). We show that our CNN model is fairly robust to smartphone hardware mismatch, so the entire model may not need to be retrained for use with different smartphones. The proposed application displays the direction of a talker on the screen of an Android smartphone, with the aim of improving hearing for people with hearing disorders.
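As a rough illustration of the input feature described in the abstract, the sketch below computes the real and imaginary STFT parts for one 20 ms frame from two microphones. Only the 20 ms frame length and the two-microphone setup come from the abstract; the 16 kHz sampling rate, the Hann window, the 512-point FFT, the stacking order, and the function name `stft_real_imag_features` are assumptions made for this example and may differ from the authors' implementation.

```python
import numpy as np

SAMPLE_RATE = 16000                    # assumed; not stated in the abstract
FRAME_LEN = int(0.020 * SAMPLE_RATE)   # 20 ms frame (from the abstract) -> 320 samples
N_FFT = 512                            # assumed FFT size

def stft_real_imag_features(frame_2ch, n_fft=N_FFT):
    """Return a (4, n_fft // 2 + 1) array of real and imaginary STFT parts.

    frame_2ch: array of shape (2, frame_len), one row per microphone.
    Row order of the result (an assumed layout): mic 1 real, mic 1 imaginary,
    mic 2 real, mic 2 imaginary.
    """
    frame_2ch = np.asarray(frame_2ch, dtype=np.float64)
    if frame_2ch.ndim != 2 or frame_2ch.shape[0] != 2:
        raise ValueError("expected an array of shape (2, frame_len)")
    # Window each channel, then take the one-sided FFT of the frame.
    window = np.hanning(frame_2ch.shape[1])
    spectrum = np.fft.rfft(frame_2ch * window, n=n_fft, axis=1)
    return np.stack([
        spectrum[0].real, spectrum[0].imag,
        spectrum[1].real, spectrum[1].imag,
    ])

# Example: one frame of synthetic two-channel noise.
rng = np.random.default_rng(0)
example_frame = rng.standard_normal((2, FRAME_LEN))
example_features = stft_real_imag_features(example_frame)
```

Keeping the real and imaginary parts separate preserves the inter-microphone phase relationship, which is the cue a two-microphone localizer would rely on. How frames are stacked for the multi-frame case is not described in the abstract and is left out here.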
Pages: 169969-169978
Number of pages: 10