Sound Source Localization Inside a Structure Under Semi-Supervised Conditions

Cited by: 2
Authors
Kita, Shunsuke [1]
Kajikawa, Yoshinobu [2]
Affiliations
[1] Osaka Res Inst Ind Sci & Technol, Div Elect & Mech Syst, Osaka 594115, Japan
[2] Kansai Univ, Fac Engn Sci, Osaka 5648680, Japan
Keywords
Data models; Adaptation models; Acoustics; Speech processing; Predictive models; Location awareness; Training; Sound source localization; domain transfer; acoustic-structure coupling; t-distributed stochastic neighbor embedding
DOI
10.1109/TASLP.2023.3263776
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We propose a method for applying a sound source localization (SSL) model trained on simulated data in a real-world environment, with a domain transfer (DT) model for the SSL inside a structure. The DT model transfers real data into pseudo-simulation data. The SSL model trained on the simulation data is then adapted to the real data using the DT model. Our method consists of an SSL model and a DT model. The SSL model predicts the position of a sound source inside the structure, whereas the DT model transforms the data. Because our simulation is not perfect, real data are extrapolated for use with the SSL model. However, the data transformed by the DT model are interpolated within the feature space. The outcome is that the performance of the SSL model in the real world is improved. In our study, the frequency spectra of accelerometers observed on the outer surface of the structure are the model input. The goal is to predict the position of the sound source. The SSL model is built using deep and convolutional neural networks, and the DT model is built using either an autoencoder, a deep convolutional autoencoder, or pix2pix. The two-dimensional distributions of the t-distributed Stochastic Neighbor Embedding indicate that using pix2pix as the DT model shows the best performance. Furthermore, our method's performance for SSL is improved by 57% for the classification problem and by 27% for the regression problem when compared to the case where no transformation is applied.
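The two-stage idea in the abstract (transform real data toward the simulation domain, then apply a localizer trained only on simulated data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the data are synthetic, the DT model is replaced by a least-squares affine map instead of an autoencoder or pix2pix, the SSL model is replaced by a nearest-centroid classifier instead of a neural network, and paired real/simulated samples are assumed to be available for fitting the map. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_bins, n_per_class = 4, 16, 50

# Hypothetical "simulated" spectra: one prototype per source position plus noise.
prototypes = rng.normal(size=(n_classes, n_bins))
labels = np.repeat(np.arange(n_classes), n_per_class)
sim = prototypes[labels] + 0.05 * rng.normal(size=(labels.size, n_bins))

# Hypothetical "real" spectra: the simulated ones distorted by an unknown
# per-bin gain and offset, standing in for the simulation mismatch.
gain = rng.uniform(0.5, 1.5, size=n_bins)
offset = rng.normal(size=n_bins)
real = sim * gain + offset + 0.05 * rng.normal(size=sim.shape)


def fit_ssl(x, y):
    """Stand-in SSL model: one centroid per source-position class."""
    return np.stack([x[y == c].mean(axis=0) for c in np.unique(y)])


def predict_ssl(centroids, x):
    """Assign each spectrum to the class of its nearest centroid."""
    dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)


def with_bias(x):
    """Append a constant column so the linear map can learn an offset."""
    return np.hstack([x, np.ones((x.shape[0], 1))])


def fit_dt(real_x, sim_x):
    """Stand-in DT model: least-squares affine map from real to simulated."""
    weights, *_ = np.linalg.lstsq(with_bias(real_x), sim_x, rcond=None)
    return weights


def apply_dt(weights, real_x):
    """Transform real spectra into pseudo-simulation spectra."""
    return with_bias(real_x) @ weights


# The SSL stand-in sees only simulated data.
centroids = fit_ssl(sim, labels)

# Fit the DT stand-in on half of the paired samples, evaluate on the rest.
order = rng.permutation(labels.size)
fit_idx, test_idx = order[: labels.size // 2], order[labels.size // 2:]
dt_weights = fit_dt(real[fit_idx], sim[fit_idx])

acc_direct = np.mean(predict_ssl(centroids, real[test_idx]) == labels[test_idx])
pseudo_sim = apply_dt(dt_weights, real[test_idx])
acc_dt = np.mean(predict_ssl(centroids, pseudo_sim) == labels[test_idx])

print(f"accuracy without DT: {acc_direct:.2f}")
print(f"accuracy with DT:    {acc_dt:.2f}")
```

In this toy setting the localizer is never retrained; only its input is moved back toward the domain it was trained on, which mirrors the division of labor between the DT and SSL models described above.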
Pages: 1397-1408
Page count: 12