Predominant audio source separation in polyphonic music

Authors
Lekshmi Chandrika Reghunath
Rajeev Rajan
Affiliations
[1] College of Engineering Trivandrum,Department of Electronics and Communication Engineering
[2] APJ Abdul Kalam Technological University,Department of Electronics and Communication Engineering
[3] Government Engineering College Barton Hill
[4] APJ Abdul Kalam Technological University
Keywords
Predominant; Spectrogram; Time-frequency filtering; Generative adversarial network; Binary masking
DOI
Not available
Abstract
Predominant source separation is the task of isolating one or more desired predominant signals, such as the voice or lead instruments, from polyphonic music. The proposed work applies time-frequency filtering to predominant source separation and uses conditional adversarial networks to improve the perceived quality of the isolated sounds. The pitch tracks corresponding to the prominent sound sources of the polyphonic music are estimated using a predominant pitch extraction algorithm, and a binary mask corresponding to each pitch track and its harmonics is generated. Time-frequency filtering is then performed on the spectrogram of the input signal using this binary mask, which isolates the dominant sources based on pitch. The perceptual quality of the source-separated music signal is enhanced using a CycleGAN-based conditional adversarial network operating on spectrogram images. The reconstructed spectrogram is converted back to a music signal by applying the inverse short-time Fourier transform. The intelligibility of the separated audio is enhanced using an intelligibility enhancement module based on an audio style transfer scheme. The proposed work is systematically evaluated on the IRMAS and ADC 2004 datasets, using both subjective and objective evaluations. Its performance is compared with the state-of-the-art Demucs and Wave-U-Net architectures and is competitive both objectively and subjectively.
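The pitch-guided masking step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `harmonic_binary_mask` and `apply_mask`, the number of harmonics, the frequency tolerance, and the convention that a pitch of 0 Hz marks an unvoiced frame are all assumptions made for illustration. The sketch only shows how a binary mask can be built around a pitch track and its harmonics and then applied to a spectrogram by element-wise multiplication.

```python
import numpy as np


def harmonic_binary_mask(pitch_hz, freqs_hz, n_harmonics=10, tol_hz=40.0):
    """Build a (n_bins, n_frames) 0/1 mask around each harmonic of a pitch track.

    pitch_hz : per-frame pitch in Hz; 0 marks an unvoiced frame (assumed convention).
    freqs_hz : center frequency in Hz of every spectrogram bin.
    """
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    mask = np.zeros((freqs_hz.size, pitch_hz.size), dtype=bool)
    for k in range(1, n_harmonics + 1):
        # Distance of every bin from the k-th harmonic, frame by frame.
        dist = np.abs(freqs_hz[:, None] - k * pitch_hz[None, :])
        mask |= dist <= tol_hz
    # Unvoiced frames pass no energy at all.
    mask[:, pitch_hz <= 0] = False
    return mask.astype(float)


def apply_mask(spectrogram, mask):
    """Time-frequency filtering: keep only the bins selected by the mask."""
    return spectrogram * mask


# Toy example: 8 kHz sampling rate, 256-point FFT (bin spacing 31.25 Hz).
freqs = np.fft.rfftfreq(256, d=1.0 / 8000.0)
pitch_track = [0.0, 250.0, 250.0]           # first frame is unvoiced
mask = harmonic_binary_mask(pitch_track, freqs, n_harmonics=3, tol_hz=20.0)
spectrogram = np.ones((freqs.size, 3), dtype=complex)  # placeholder STFT
filtered = apply_mask(spectrogram, mask)
```

In a full pipeline, the filtered complex spectrogram would be passed to an inverse short-time Fourier transform to recover a waveform; the enhancement stages mentioned in the abstract are outside the scope of this sketch.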
Related papers
50 in total
  • [21] FUSING TRANSCRIPTION RESULTS FROM POLYPHONIC AND MONOPHONIC AUDIO FOR SINGING MELODY TRANSCRIPTION IN POLYPHONIC MUSIC
    Zhu, Bilei
    Wu, Fuzhang
    Li, Ke
    Wu, Yongjian
    Huang, Feiyue
    Wu, Yunsheng
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017: 296-300
  • [22] Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music
    Han, Yoonchang
    Kim, Jaehun
    Lee, Kyogu
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (01): 208-221
  • [23] Predominant instrument recognition from polyphonic music using feature fusion
    Ajayakumar, Roshni
    Rajan, Rajeev
    EMERGING TRENDS IN ENGINEERING, SCIENCE AND TECHNOLOGY FOR SOCIETY, ENERGY AND ENVIRONMENT, 2018: 721-726
  • [24] Audio source separation
    Davies, M
    MATHEMATICS IN SIGNAL PROCESSING V, 2002, (71): 57-68
  • [25] Polyphonic music separation based on the simplified energy splitter
    Aczél, Kristóf
    Vajk, István
    WSEAS Transactions on Signal Processing, 2008, 4 (04): 201-210
  • [26] Voice Separation in Polyphonic Music: Information Theory Approach
    Della Ventura, Michele
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018, 2018, 519: 638-646
  • [27] Transcription and separation of drum signals from polyphonic music
    Gillet, Olivier
    Richard, Gael
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (03): 529-540
  • [28] Soundprism: An Online System for Score-Informed Source Separation of Music Audio
    Duan, Zhiyao
    Pardo, Bryan
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2011, 5 (06): 1205-1215
  • [29] The Influence of Blind Source Separation on Mixed Audio Speech and Music Emotion Recognition
    Laugs, Casper
    Koops, Hendrik Vincent
    Odijk, Daan
    Kaya, Heysem
    Volk, Anja
    COMPANION PUBLICATION OF THE 2020 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION (ICMI '20 COMPANION), 2020: 67-71
  • [30] Singing Voice Separation and Pitch Extraction from Monaural Polyphonic Audio Music Via DNN and Adaptive Pitch Tracking
    Fan, Zhe-Cheng
    Jang, Jyh-Shing Roger
    Lu, Chung-Li
    2016 IEEE SECOND INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2016: 178-185