Immersive audio-visual scene reproduction using semantic scene reconstruction from 360 cameras

Authors
Kim, Hansung [1]
Remaggi, Luca [2]
Dourado, Aloisio [3]
de Campos, Teofilo [3]
Jackson, Philip J. B. [4]
Hilton, Adrian [4]
Affiliations
[1] Univ Southampton, ECS, Southampton, Hants, England
[2] Creat Labs UK, London, England
[3] Univ Brasilia, Brasilia, DF, Brazil
[4] Univ Surrey, CVSSP, Guildford, Surrey, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Audio-visual scene reproduction; Scene understanding; 3D reconstruction and completion; Spatial audio; VIRTUAL-REALITY; IMPLEMENTATION; PERCEPTION; FUTURE;
DOI
10.1007/s10055-021-00594-3
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
As personalised immersive display systems have been intensely explored in virtual reality (VR), plausible 3D audio corresponding to the visual content is required to provide more realistic experiences to users. It is well known that spatial audio synchronised with visual information improves the sense of immersion, but limited research progress has been achieved in immersive audio-visual content production and reproduction. In this paper, we propose an end-to-end pipeline to simultaneously reconstruct 3D geometry and acoustic properties of the environment from a pair of omnidirectional panoramic images. A semantic scene reconstruction and completion method using a deep convolutional neural network is proposed to estimate the complete semantic scene geometry in order to adapt spatial audio reproduction to the scene. Experiments provide objective and subjective evaluations of the proposed pipeline for plausible audio-visual VR reproduction of real scenes.
Pages: 823-838
Number of pages: 16
Source
Virtual Reality, 2022, 26: 823-838