Immersive audio-visual scene reproduction using semantic scene reconstruction from 360 cameras

Cited by: 5

Authors:

Kim, Hansung ^{[1]}

Remaggi, Luca ^{[2]}

Dourado, Aloisio ^{[3]}

de Campos, Teofilo ^{[3]}

Jackson, Philip J. B. ^{[4]}

Hilton, Adrian ^{[4]}

Affiliations:

[1] Univ Southampton, ECS, Southampton, Hants, England

[2] Creat Labs UK, London, England

[3] Univ Brasilia, Brasilia, DF, Brazil

[4] Univ Surrey, CVSSP, Guildford, Surrey, England

Source:

VIRTUAL REALITY | 2022 / Vol. 26 / Issue 03

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Audio-visual scene reproduction; Scene understanding; 3D reconstruction and completion; Spatial audio; VIRTUAL-REALITY; IMPLEMENTATION; PERCEPTION; FUTURE;

DOI:

10.1007/s10055-021-00594-3

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

As personalised immersive display systems have been intensely explored in virtual reality (VR), plausible 3D audio corresponding to the visual content is required to provide more realistic experiences to users. It is well known that spatial audio synchronised with visual information improves a sense of immersion but limited research progress has been achieved in immersive audio-visual content production and reproduction. In this paper, we propose an end-to-end pipeline to simultaneously reconstruct 3D geometry and acoustic properties of the environment from a pair of omnidirectional panoramic images. A semantic scene reconstruction and completion method using a deep convolutional neural network is proposed to estimate the complete semantic scene geometry in order to adapt spatial audio reproduction to the scene. Experiments provide objective and subjective evaluations of the proposed pipeline for plausible audio-visual VR reproduction of real scenes.


Pages: 823-838

Page count: 16

