Audio Compensation Network: to Improve the Quality of Low-Energy Audio in Visual Sound Separation

Cited by: 0
Authors
Gao, Yining [1]
Zhang, Pengyuan [1]
Tian, Zejia [1]
Affiliations
[1] Wuhan Res Inst Posts & Telecommun, Wuhan, Hubei, Peoples R China
Keywords
cross-modal learning; audio separation; audio compensation; nonnegative matrix factorization
DOI
10.1109/ICCECE51280.2021.9342383
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Separating a single audio source from mixed audio has long been a goal for researchers. When audio is accompanied by video, visual information can aid the separation and improve its quality. However, with current methods, when independent sounds are separated from a mixed video, high-energy audio seriously interferes with low-energy audio. We call these phenomena "over-complete separation" and "incomplete separation"; both seriously degrade separation quality. This paper proposes a new separation network that compensates for the loss of low-energy audio during the separation process, improves the signal-to-noise ratio and loudness of low-energy audio, and achieves better results than state-of-the-art methods on our dataset.
Pages: 727-732
Page count: 6