Efficient, compelling and immersive VR audio experience using Scene Based Audio/Higher Order Ambisonics

Authors
Shivappa, Shankar [1]
Morrell, Martin [1]
Sen, Deep [1]
Peters, Nils [1]
Salehin, S. M. Akramus [1]
Affiliation
[1] Qualcomm Technol Inc QTI, San Diego, CA 92121 USA
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
For a fully immersive and compelling VR experience, the acoustic illusion of being 'present' in the virtual world must be created. Achieving this illusion requires two things: (1) authentic spatial audio production and (2) tracking of the listener's head position and orientation, with the audio scene adapted accordingly. This paper shows how scene-based audio (SBA), a term often used synonymously with Higher Order Ambisonics (HOA), is ideal for VR because of its ease of acoustic capture, offline content creation, post-production, transmission and interactive rendering. Compared to object-based audio, the rendering complexity of SBA is much lower. SBA can also offer higher and more coherent spatial fidelity than channel-based audio. One of the advantages of SBA is flexible rendering: the same audio stream can be rendered to various loudspeaker formats, as well as binaurally for headphone consumption. The paper discusses the need for efficient SBA compression for VR content delivery, and presents MPEG-H as an efficient and versatile delivery system for SBA. For a personalized VR experience, accurate binaural rendering is essential. SBA can be binauralized efficiently, because the number of convolutions required is proportional to the number of HOA coefficients rather than to the number of virtual loudspeakers. This means that SBA can be rendered to a large number of virtual loudspeakers without increasing the computational cost of binauralization. Furthermore, to improve spatial perception, SBA binauralization can use grids of ideally positioned virtual loudspeakers, based on Platonic solids or other regularly spaced configurations, that are impractical in reality and unsupported in channel-based audio formats. Interactive soundfield rotation in real time is indispensable for creating a VR experience. We show how an SBA soundfield can be rotated and further enhanced with other user-controlled effects, such as zooming.
The paper discusses use cases that demonstrate the capture, processing, and playback of SBA, and shows potential pitfalls and design strategies for an end-to-end spatial audio system for VR. The authors conclude that SBA is a robust and compelling audio format for VR, and that SBA can be easily distributed via broadcast or OTT for real-time end-consumer use.
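The abstract's points about binauralization cost and soundfield rotation can be illustrated with a short Python sketch. It is not taken from the paper. The function names are hypothetical, the count of one convolution per signal per ear is a simplifying assumption, and the rotation covers only yaw for a first-order signal, assuming X points to the front, Y to the left and Z up. The channel count (N + 1)^2 is the standard number of coefficient signals in a full 3D HOA representation of order N.

```python
import math


def hoa_channel_count(order: int) -> int:
    """Number of coefficient signals in a full 3D HOA representation of the given order."""
    return (order + 1) ** 2


def binaural_convolutions_loudspeaker_domain(num_virtual_speakers: int) -> int:
    """Assumed cost model: one HRTF convolution per virtual loudspeaker per ear."""
    return 2 * num_virtual_speakers


def binaural_convolutions_hoa_domain(order: int) -> int:
    """Assumed cost model: one filter per HOA coefficient per ear.

    The virtual loudspeaker layout is folded into the filters offline,
    so the loudspeaker count does not appear in the run-time cost.
    """
    return 2 * hoa_channel_count(order)


def rotate_foa_yaw(w: float, x: float, y: float, z: float, yaw_rad: float):
    """Rotate one first-order sample (W, X, Y, Z) about the vertical axis.

    A positive angle turns the soundfield counterclockwise as seen from above.
    To compensate head movement, pass the negative of the tracked head yaw.
    W (omnidirectional) and Z (vertical) are unaffected by a yaw rotation.
    """
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return w, c * x - s * y, s * x + c * y, z


if __name__ == "__main__":
    order = 4
    for speakers in (32, 64, 128):
        print(
            f"order {order}, {speakers} virtual loudspeakers: "
            f"{binaural_convolutions_loudspeaker_domain(speakers)} convolutions "
            f"(loudspeaker domain) vs {binaural_convolutions_hoa_domain(order)} (HOA domain)"
        )
    # A source straight ahead, rotated by 90 degrees, ends up on the left.
    print(rotate_foa_yaw(1.0, 1.0, 0.0, 0.0, math.pi / 2))
```

Under these assumptions, the HOA-domain count stays fixed as the virtual loudspeaker grid grows, which is the property the abstract describes. Higher-order rotation needs larger per-order rotation matrices and is not shown here.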
Pages: 10