INRAS: Implicit Neural Representation for Audio Scenes

Authors
Su, Kun [1]
Chen, Mingfei [1]
Shlizerman, Eli [1, 2]
Affiliations
[1] Univ Washington, Dept Elect & Comp Engn, Seattle, WA 98195 USA
[2] Univ Washington, Dept Appl Math, Seattle, WA 98195 USA
Funding
National Science Foundation (USA)
Keywords
SOUND-PROPAGATION; EFFICIENT; MODEL;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The spatial acoustic information of a scene, i.e., how sounds emitted from a particular location in the scene are perceived at another location, is key for immersive scene modeling. A robust representation of a scene's acoustics can be formulated as a continuous field together with impulse responses that vary with the emitter-listener locations. The impulse responses are then used to render the sounds perceived by the listener. While such a representation is advantageous, parameterizing impulse responses for generic scenes is a challenge. Traditional pre-computation methods implement the parameterization only at discrete probe points and require large storage, while other existing methods, such as geometry-based sound simulations, cannot simulate all wave-based sound effects. In this work, we introduce a novel neural network for a light-weight Implicit Neural Representation for Audio Scenes (INRAS), which can render high-fidelity time-domain impulse responses at arbitrary emitter-listener positions by learning a continuous implicit function. INRAS disentangles the scene's geometry features with three modules that generate independent features for the emitter, the geometry of the scene, and the listener, respectively. This leads to efficient reuse of scene-dependent features and supports effective multi-condition training for multiple scenes. Our experimental results show that INRAS outperforms existing approaches for representation and rendering of sounds for varying emitter-listener locations in all aspects, including impulse response quality, inference speed, and storage requirements.
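The abstract notes that impulse responses are used to render the sound perceived by the listener. As a minimal sketch of that rendering step only (not of INRAS itself), assuming a mono source signal and an impulse response that is already available for one emitter-listener pair, the perceived signal is the convolution of the two. The function name `render_at_listener` and the toy values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def render_at_listener(source, impulse_response):
    """Convolve a dry mono signal with a time-domain impulse response.

    Hypothetical helper: in practice the impulse response would be
    predicted or measured for a specific emitter-listener pair.
    """
    return np.convolve(source, impulse_response, mode="full")


# Toy impulse response (assumed values): direct sound at sample 0
# plus a single reflection at sample 3 with half the amplitude.
ir = np.zeros(5)
ir[0] = 1.0
ir[3] = 0.5

# Toy dry source signal (assumed values).
source = np.array([1.0, 2.0, 0.0])

rendered = render_at_listener(source, ir)
# The output has len(source) + len(ir) - 1 samples: the original signal
# followed by a delayed, attenuated copy of it.
```

For long impulse responses, an FFT-based convolution is usually preferred for speed; the direct form is used here only to keep the sketch short.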
Pages: 15
Related papers
  • [1] Implicit Neural Visual Representation Compression of 3D Scenes
    Bang, Gun
    Do, Jihoon
    Kang, Jung Won
    Bea, Seong-Jun
    Lee, Hahyun
    Lee, Jinho
    Kim, Soowoong
    [J]. INTERNATIONAL WORKSHOP ON ADVANCED IMAGING TECHNOLOGY, IWAIT 2023, 2023, 12592
  • [2] Audio-guided implicit neural representation for local image stylization
    Lee, Seung Hyun
    Kim, Sieun
    Byeon, Wonmin
    Oh, Gyeongrok
    In, Sumin
    Park, Hyeongcheol
    Yoon, Sang Ho
    Hong, Sung-Hee
    Kim, Jinkyu
    Kim, Sangpil
    [J]. COMPUTATIONAL VISUAL MEDIA, 2024, : 1185 - 1204
  • [3] Three-Dimensional Reconstruction of Indoor Scenes Based on Implicit Neural Representation
    Lin, Zhaoji
    Huang, Yutao
    Yao, Li
    [J]. Journal of Imaging, 2024, 10 (09)
  • [4] Surface Normal Clustering for Implicit Representation of Manhattan Scenes
    Popovic, Nikola
    Paudel, Danda Pani
    Van Gool, Luc
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 17814 - 17824
  • [5] Audio-visual aligned saliency model for omnidirectional video with implicit neural representation learning
    Zhu, Dandan
    Shao, Xuan
    Zhang, Kaiwei
    Min, Xiongkuo
    Zhai, Guangtao
    Yang, Xiaokang
    [J]. APPLIED INTELLIGENCE, 2023, 53 (19) : 22615 - 22634
  • [6] Audio-visual aligned saliency model for omnidirectional video with implicit neural representation learning
    Dandan Zhu
    Xuan Shao
    Kaiwei Zhang
    Xiongkuo Min
    Guangtao Zhai
    Xiaokang Yang
    [J]. Applied Intelligence, 2023, 53 : 22615 - 22634
  • [7] Neural explicit and implicit knowledge representation
    Neagu, CD
    Palade, V
    [J]. KES'2000: FOURTH INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED INTELLIGENT ENGINEERING SYSTEMS & ALLIED TECHNOLOGIES, VOLS 1 AND 2, PROCEEDINGS, 2000, : 213 - 216
  • [8] Regularize implicit neural representation by itself
    Li, Zhemin
    Wang, Hongxia
    Meng, Deyu
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10280 - 10288
  • [9] MINER: Multiscale Implicit Neural Representation
    Saragadam, Vishwanath
    Tan, Jasper
    Balakrishnan, Guha
    Baraniuk, Richard G.
    Veeraraghavan, Ashok
    [J]. COMPUTER VISION, ECCV 2022, PT XXIII, 2022, 13683 : 318 - 333
  • [10] Neural explicit and implicit knowledge representation
    Neagu, Ciprian-Daniel
    Palade, Vasile
    [J]. International Conference on Knowledge-Based Intelligent Electronic Systems, Proceedings, KES, 2000, 1 : 213 - 216