Multi-step Coding Structure of Spatial Audio Object Coding

Cited by: 1
Authors
Hu, Chenhao [1, 2]
Hu, Ruimin [1, 2]
Wang, Xiaochen [1, 2]
Wu, Tingzhao [1, 2]
Li, Dengshi [1, 3]
Affiliations
[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Wuhan, Peoples R China
[2] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China
[3] Jianghan Univ, Sch Math & Comp, Wuhan, Peoples R China
Source
Funding
National Key Research and Development Program;
Keywords
Audio object coding; Residual coding; Spatial audio;
DOI
10.1007/978-3-030-37731-1_54
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Spatial audio object coding (SAOC) is an effective method that compresses multiple audio objects and provides flexibility for personalized rendering in interactive services. It divides each signal frame into 28 sub-bands and extracts one set of object spatial parameters for each sub-band. In this way, the objects are coded into a downmix signal and a small number of parameters. However, using the same parameters across a whole sub-band causes frequency aliasing distortion, which seriously degrades the listening experience. Existing improvements to SAOC cannot guarantee that all audio objects are decoded well. This paper describes a new multi-step object coding structure that efficiently calculates the residual of each object as additional side information to compensate for that object's aliasing distortion. Because the object encoding order affects the final decoded quality, a sorting strategy based on the sub-band energy of each object is proposed to determine which audio object should be encoded in each step. Singular value decomposition (SVD) is used to limit the bit-rate increase caused by the added side information. The experimental results show that the proposed method performs better than SAOC and SAOC-TSC, and that each object can be decoded well with respect to both bit-rate and sound quality.
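The energy-based sorting step mentioned in the abstract can be illustrated with a minimal sketch. Only the count of 28 sub-bands comes from the abstract; the uniform band edges, the function names, and the "highest energy first" rule are assumptions made for illustration and are not taken from the paper.

```python
from typing import List

NUM_SUBBANDS = 28  # stated in the abstract; the band layout below is assumed


def uniform_band_edges(num_bins: int, num_bands: int = NUM_SUBBANDS) -> List[int]:
    """Split num_bins frequency bins into num_bands contiguous bands.

    Uniform edges are a placeholder; the abstract does not give the
    actual sub-band partition.
    """
    return [b * num_bins // num_bands for b in range(num_bands + 1)]


def subband_energies(spectrum: List[float], edges: List[int]) -> List[float]:
    """Sum of squared magnitudes in each band for one object's frame."""
    return [sum(x * x for x in spectrum[lo:hi]) for lo, hi in zip(edges, edges[1:])]


def encoding_order(object_spectra: List[List[float]], band: int,
                   edges: List[int]) -> List[int]:
    """Object indices sorted by energy in `band`, highest first.

    The descending order is an assumed rule, used here only to show
    how a per-sub-band ordering could be derived.
    """
    energies = [subband_energies(s, edges)[band] for s in object_spectra]
    return sorted(range(len(object_spectra)), key=lambda i: energies[i], reverse=True)


if __name__ == "__main__":
    # Three toy objects with flat magnitude spectra of 56 bins each.
    edges = uniform_band_edges(56)
    spectra = [[1.0] * 56, [3.0] * 56, [2.0] * 56]
    print(encoding_order(spectra, band=0, edges=edges))
```

In a multi-step structure of the kind the abstract describes, an ordering like this would decide which object is handled at each step; the residual computation and the SVD-based reduction of side information are not sketched here.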
Pages: 666-678
Number of pages: 13
Related papers
50 in total
  • [31] Multichannel matching pursuit and applications to spatial audio coding
    Goodwin, Michael M.
    2006 FORTIETH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1-5, 2006, : 1114 - 1118
  • [32] Multi-step ahead time series forecasting via sparse coding and dictionary based techniques
    Helmi, Ahmed
    Fakhr, Mohamed W.
    Atiya, Amir F.
    APPLIED SOFT COMPUTING, 2018, 69 : 464 - 474
  • [33] Audio object coding based on optimal parameter frequency resolution
    Wu, Tingzhao
    Hu, Ruimin
    Wang, Xiaochen
    Ke, Shanfa
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (15) : 20723 - 20738
  • [34] Parameter Domain Loudness Estimation in Parametric Audio Object Coding
    Paulus, Jouni
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 2469 - 2473
  • [35] MULTICHANNEL OBJECT-BASED AUDIO CODING WITH CONTROLLABLE QUALITY
    Gorlow, Stanislaw
    Habets, Emanuel A. P.
    Marchand, Sylvain
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 561 - 565
  • [36] Audio object coding based on optimal parameter frequency resolution
    Tingzhao Wu
    Ruimin Hu
    Xiaochen Wang
    Shanfa Ke
    Multimedia Tools and Applications, 2019, 78 : 20723 - 20738
  • [37] Distortion Reduction via CAE and DenseNet Mixture Network for Low Bitrate Spatial Audio Object Coding
    Wu, Yulin
    Hu, Ruimin
    Wang, Xiaochen
    Hu, Chenhao
    Ke, Shanfa
    IEEE MULTIMEDIA, 2022, 29 (01) : 55 - 64
  • [38] Joint speech/audio coding based scalable perceptual audio coding
    Gao, Li
    Hu, Ruimin
    Yang, Yuhong
    2014 IEEE/ACIS 13TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 2014, : 419 - 424
  • [39] Audio coding: From broadcast standard(s) to advanced audio coding
    Noll, Peter
    IT - Information Technology, 1999, 41 (01): : 12 - 18
  • [40] Encoding Multichannel Audio for Ultra HDTV Based on Spatial Audio Coding with Optimization
    Elfitri, Ikhwana
    Nursyam, Doni
    Fitrilina
    Kurnia, Rahmadi
    2018 IEEE REGION TEN SYMPOSIUM (TENSYMP), 2018, : 140 - 144