Perceptual Audio Object Coding Using Adaptive Subband Grouping with CNN and Residual Block

Cited by: 1
Authors
Wu, Yulin [1]
Hu, Ruimin [1]
Wang, Xiaochen [1]
Affiliation
[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China
Keywords
spatial audio object coding (SAOC); perceptual coding; adaptive subband grouping; aliasing distortion
DOI
10.1109/ICME55011.2023.00433
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Spatial audio content is becoming increasingly popular and is commonly represented as a set of object signals with associated metadata. This object-based representation is independent of the loudspeaker layout and provides high spatial resolution when reproduced on more loudspeakers. The traditional spatial audio object coding (SAOC) method suffers from severe aliasing distortion, which impairs the immersive listening experience. In this study, we reduce aliasing distortion with a perceptual adaptive subband grouping strategy and use a convolutional neural network (CNN) and a residual block to build the side-information compression model. Objective and subjective experiments on benchmark datasets at different bitrates show that the proposed method performs favorably against state-of-the-art methods.
Pages: 2543-2548
Number of pages: 6