Perceptual Audio Object Coding Using Adaptive Subband Grouping with CNN and Residual Block

Cited by: 1
Authors
Wu, Yulin [1]
Hu, Ruimin [1]
Wang, Xiaochen [1]
Affiliations
[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China
Keywords
spatial audio object coding (SAOC); perceptual coding; adaptive subband grouping; aliasing distortion
DOI
10.1109/ICME55011.2023.00433
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Spatial audio content is becoming increasingly popular and is commonly regarded as a set of object signals with associated metadata. This object-based representation is independent of the loudspeaker layout and provides high spatial resolution when reproduced on more loudspeakers. The traditional spatial audio object coding (SAOC) method suffers from severe aliasing distortion, which impairs the immersive listening experience. In this study, we reduce aliasing distortion with a perceptual adaptive subband grouping strategy and use a convolutional neural network (CNN) and a residual block to build the side-information compression model. Both objective and subjective experiments on benchmark datasets at different bitrates show that the proposed method performs favorably against state-of-the-art methods.
Pages: 2543-2548
Page count: 6