AUDIO CODING BASED ON SPECTRAL RECOVERY BY CONVOLUTIONAL NEURAL NETWORK

Cited by: 0
Authors
Shin, Seong-Hyeon [1]
Beack, Seung Kwon [2]
Lee, Taejin [2]
Park, Hochong [1]
Affiliations
[1] Kwangwoon Univ, Seoul, South Korea
[2] Elect & Telecommun Res Inst, Daejeon, South Korea
Keywords
audio coding; convolutional neural network; spectral recovery; transform coding; SPEECH
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
This study proposes a new method of audio coding based on spectral recovery, which can enhance the performance of transform audio coding. An encoder represents spectral information of an input in a time-frequency domain and transmits only a portion of it so that the remaining spectral information can be recovered based on the transmitted information. A decoder recovers the magnitudes of missing spectral information using a convolutional neural network. The signs of missing spectral information are either transmitted or randomly assigned, according to their importance. By combining transmission and recovery of spectral information, the proposed method can enhance the coding performance, compared with conventional transform coding. The subjective performance evaluation shows that, for mono coding at 39.4 kbps, the proposed method provides higher sound quality than the USAC, by an average MUSHRA score of 8.5.
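The decoder-side combination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `recover_spectrum`, the data layout, and the example values are all hypothetical, and the `magnitudes` list simply stands in for the output of the convolutional neural network, which is not modeled here.

```python
import random


def recover_spectrum(transmitted, magnitudes, sign_bits, rng):
    """Merge transmitted coefficients with recovered ones (illustrative only).

    transmitted: list of float or None; None marks a coefficient that was
                 not transmitted and must be recovered.
    magnitudes:  non-negative magnitudes for every bin; a stand-in for the
                 CNN output (assumption, the network itself is not modeled).
    sign_bits:   dict {bin index: +1 or -1} holding the signs that were
                 transmitted for the more important missing bins.
    rng:         random.Random instance used for the remaining signs.
    """
    spectrum = []
    for k, coeff in enumerate(transmitted):
        if coeff is not None:
            # Transmitted coefficient: use it as received.
            spectrum.append(coeff)
            continue
        # Missing coefficient: take the recovered magnitude and attach
        # either the transmitted sign or a randomly assigned one.
        sign = sign_bits.get(k)
        if sign is None:
            sign = rng.choice((-1, 1))
        spectrum.append(sign * magnitudes[k])
    return spectrum


# Hypothetical example values.
rng = random.Random(0)
transmitted = [0.9, None, -0.4, None, None]
magnitudes = [0.0, 0.5, 0.0, 0.2, 0.1]
sign_bits = {1: -1}  # only bin 1 is assumed important enough to carry a sign
spectrum = recover_spectrum(transmitted, magnitudes, sign_bits, rng)
```

In this sketch, bins 0 and 2 pass through unchanged, bin 1 takes the recovered magnitude with its transmitted sign, and bins 3 and 4 take recovered magnitudes with random signs. How importance is measured and how the bitstream encodes the signs are not specified in the abstract, so both are left out.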
Pages: 725-729
Page count: 5