DIMINISHING DOMAIN MISMATCH FOR DNN-BASED ACOUSTIC DISTANCE ESTIMATION VIA STOCHASTIC ROOM REVERBERATION MODELS

Cited by: 0
Authors
Gburrek, Tobias [1 ]
Meise, Adrian [1 ]
Schmalenstroeer, Joerg [1 ]
Haeb-Umbach, Reinhold [1]
Affiliation
[1] Paderborn Univ, Dept Commun Engn, Paderborn, Germany
Keywords
acoustic distance estimation; room impulse response simulation; stochastic room impulse responses
DOI
10.1109/IWAENC61483.2024.10694103
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
The room impulse response (RIR) encodes, among other things, information about the distance of an acoustic source from the sensors. Deep neural networks (DNNs) have been shown to be able to extract that information for acoustic distance estimation. Because only a very limited amount of annotated data exists, e.g., RIRs with distance information, training a DNN for acoustic distance estimation has to rely on simulated RIRs, resulting in an unavoidable mismatch with the RIRs of real rooms. In this contribution, we show that this mismatch can be reduced by a novel combination of geometric and stochastic modeling of RIRs, resulting in significantly improved distance estimation accuracy.
Pages: 279-283
Number of pages: 5