Perceptual Audio Object Coding Using Adaptive Subband Grouping with CNN and Residual Block

Cited by: 1
Authors
Wu, Yulin [1]
Hu, Ruimin [1]
Wang, Xiaochen [1]
Affiliations
[1] Wuhan University, National Engineering Research Center for Multimedia Software, School of Computer Science, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, People's Republic of China
Keywords
spatial audio object coding (SAOC); perceptual coding; adaptive subband grouping; aliasing distortion
DOI
10.1109/ICME55011.2023.00433
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Spatial audio content is becoming increasingly popular and is commonly represented as a set of object signals with associated metadata. This object-based representation is independent of the loudspeaker layout and provides high spatial resolution when reproduced over more loudspeakers. However, the traditional spatial audio object coding (SAOC) method suffers from severe aliasing distortion, which degrades audio quality and impairs the immersive listening experience. In this study, we reduce aliasing distortion with a perceptual adaptive subband grouping strategy and use a convolutional neural network (CNN) and a residual block to build the side-information compression model. Objective and subjective experiments on benchmark datasets at different bitrates show that the proposed method performs favorably against state-of-the-art methods.
Pages: 2543-2548
Number of pages: 6