FILTERED NOISE SHAPING FOR TIME DOMAIN ROOM IMPULSE RESPONSE ESTIMATION FROM REVERBERANT SPEECH

Cited by: 16
Authors
Steinmetz, Christian J. [1, 2]
Ithapu, Vamsi Krishna [2]
Calamia, Paul [2]
Affiliations
[1] Queen Mary Univ London, Ctr Digital Mus, London, England
[2] Facebook Real Labs Res, Redmond, WA USA
Keywords
Room impulse response; acoustic matching; reverberation; synthesis; blind estimation;
DOI
10.1109/WASPAA52581.2021.9632680
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Deep learning approaches have emerged that aim to transform an audio signal so that it sounds as if it was recorded in the same room as a reference recording, with applications both in audio post-production and augmented reality. In this work, we propose FiNS, a Filtered Noise Shaping network that directly estimates the time domain room impulse response (RIR) from reverberant speech. Our domain-inspired architecture features a time domain encoder and a filtered noise shaping decoder that models the RIR as a summation of decaying filtered noise signals, along with direct sound and early reflection components. Previous methods for acoustic matching either utilize large models to transform audio to match the target room or predict parameters for algorithmic reverberators. Instead, blind estimation of the RIR enables efficient and realistic transformation with a single convolution. An evaluation demonstrates that our model not only synthesizes RIRs that match parameters of the target room, such as the T60 and DRR, but also more accurately reproduces perceptual characteristics of the target room, as shown in a listening test when compared to deep learning baselines.
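The signal model described in the abstract (an RIR built from a direct-sound component plus a sum of decaying filtered noise signals, then applied with a single convolution) can be illustrated with a minimal NumPy sketch. This is not the FiNS network: in the paper the filters and decay shapes are predicted from reverberant speech, whereas here the band edges, per-band T60 values, tail gain, and the function names `synth_rir` and `apply_rir` are all hypothetical, hand-set choices made only for illustration.

```python
import numpy as np

def synth_rir(sr=16000, dur=0.5, t60s=(0.6, 0.5, 0.4, 0.3), seed=0):
    """Hand-set illustration: direct impulse + sum of decaying band-limited noise."""
    rng = np.random.default_rng(seed)
    n = int(sr * dur)
    t = np.arange(n) / sr
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    # Assumption: equal-width bands up to Nyquist, one per T60 value.
    edges = np.linspace(0.0, sr / 2.0, len(t60s) + 1)
    tail = np.zeros(n)
    for b, t60 in enumerate(t60s):
        noise = rng.standard_normal(n)
        # Band-limit the noise with an ideal (brick-wall) mask in the frequency domain.
        mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
        band = np.fft.irfft(np.fft.rfft(noise) * mask, n)
        # Exponential amplitude envelope that reaches -60 dB at t = T60.
        env = 10.0 ** (-3.0 * t / t60)
        tail += band * env
    # Assumption: scale the late tail to a peak of 0.1 relative to the direct sound.
    tail *= 0.1 / np.max(np.abs(tail))
    rir = tail.copy()
    rir[0] += 1.0  # direct sound as a unit impulse at time zero
    return rir

def apply_rir(dry, rir):
    """Apply the room response to a dry signal with a single convolution."""
    return np.convolve(dry, rir)

rir = synth_rir()
dry = np.zeros(100)
dry[0] = 1.0  # a unit impulse as a stand-in for a dry recording
wet = apply_rir(dry, rir)
```

Convolving a unit impulse with the RIR returns the RIR itself, which makes the example easy to check; with real speech in place of `dry`, the same call would produce the reverberant signal.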
Pages: 221-225
Page count: 5
Related papers
50 in total
  • [21] OPTIMUM ESTIMATION OF IMPULSE RESPONSE IN THE PRESENCE OF NOISE
    LEVIN, MJ
    PROCEEDINGS OF THE INSTITUTE OF RADIO ENGINEERS, 1959, 47 (03): : 473 - 473
  • [22] Time difference of arrival estimation of speech source in a noisy and reverberant environment
    Dvorkind, TG
    Gannot, S
    SIGNAL PROCESSING, 2005, 85 (01) : 177 - 204
  • [23] SPEECH DEREVERBERATION USING NMF WITH REGULARIZED ROOM IMPULSE RESPONSE
    Mohanan, Nikhil
    Velmurugan, Rajbabu
    Rao, Preeti
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4955 - 4959
  • [24] A hybrid MLS technique for room impulse response estimation
    Paulo, Joel Preto
    Martins, Carlos Rodrigues
    Bento Coelho, J. L.
    APPLIED ACOUSTICS, 2009, 70 (04) : 556 - 562
  • [25] Low-Rank Room Impulse Response Estimation
    Jalmby, Martin
    Elvander, Filip
    van Waterschoot, Toon
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 957 - 969
  • [26] Room impulse response reshaping-based expectation-maximization in an underdetermined reverberant environment
    Xie, Yuan
    Zou, Tao
    Yang, Junjie
    Sun, Weijun
    Xie, Shengli
    COMPUTER SPEECH AND LANGUAGE, 2024, 88
  • [27] Prediction of Reverberant Properties of Enclosures via a Method Employing a Modal Representation of the Room Impulse Response
    Meissner, Miroslaw
    ARCHIVES OF ACOUSTICS, 2016, 41 (01) : 27 - 41
  • [28] Method for time-of-flight estimation of low frequency acoustic signals in reverberant and noisy environment with sparse impulse response
    Elfering, Michael
    Annas, Sven
    Jantzen, Hans-Arno
    Janoske, Uwe
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2022, 33 (04)
  • [29] Computing Impulse Response of Room Acoustics Using the Ray-Tracing Method in Time Domain
    Alpkocak, Adil
    Sis, Malik Kemal
    ARCHIVES OF ACOUSTICS, 2010, 35 (04) : 505 - 519
  • [30] Room Boundary Estimation from Acoustic Room Impulse Responses
    Remaggi, Luca
    Jackson, Philip J. B.
    Coleman, Philip
    Wang, Wenwu
    2014 SENSOR SIGNAL PROCESSING FOR DEFENCE (SSPD), 2014,