Graph modeling for vocal melody extraction from polyphonic music

Cited by: 1
Authors
Zhang, Weiwei [1]
Yan, Lingyu [1]
Zhang, Qiaoling [2]
Gao, Jinyi [1]
Affiliations
[1] Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China
[2] Zhejiang Sci Tech Univ, Sch Informat & Elect, Hangzhou 310018, Peoples R China
Keywords
Vocal melody extraction; Graph modeling; Graph convolutional network; Shift-invariant graph structure; AUDIO
DOI
10.1016/j.apacoust.2023.109491
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a vocal melody extraction method based on graph modeling. First, the constant-Q transform of the mixed audio signal is computed. Then, the amplitude spectra of several adjacent frames are concatenated to construct the input feature. Next, an undirected graph is constructed to model the melody extraction problem: the frequency bins are viewed as nodes, and the underlying connection relationships between frequency bins are defined as edges. The frame-wise melodic pitches are estimated by a graph convolutional network (GCN), with pitch estimation treated as a multi-class classification problem. Finally, the quantized frame-wise pitches are fine-tuned according to a salience function defined over a certain range around the smoothed melody trajectory obtained from the GCN-estimated pitches. Because the edges are defined according to the underlying connection relationships between frequency bins, the proposed method addresses vocal melody extraction in an explainable way. Experimental results demonstrate that the proposed method achieves good performance with a lightweight set of parameters. © 2023 Elsevier Ltd. All rights reserved.
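The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which bin-to-bin relationships define the edges, so the sketch assumes edges link each frequency bin to the bins of its harmonics. On a log-frequency (constant-Q) axis, harmonic h lies a fixed number of bins above its fundamental, so the same offsets apply at every bin, which is one way to obtain a shift-invariant structure. The function names, the 12 bins per octave, the choice of 4 harmonics, and the use of adjacent-frame amplitudes as node features are all hypothetical choices made for illustration.

```python
import math

def harmonic_offsets(bins_per_octave, n_harmonics):
    """Bin offsets of harmonics 2..n_harmonics on a log-frequency (CQT) axis."""
    return sorted({round(bins_per_octave * math.log2(h))
                   for h in range(2, n_harmonics + 1)})

def build_adjacency(n_bins, offsets):
    """Undirected graph: node = frequency bin, edge = harmonic offset (assumed)."""
    adj = [[0.0] * n_bins for _ in range(n_bins)]
    for i in range(n_bins):
        adj[i][i] = 1.0  # self-loop, as in the usual GCN formulation
        for d in offsets:
            j = i + d
            if j < n_bins:
                adj[i][j] = adj[j][i] = 1.0  # symmetric, hence undirected
    return adj

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(a_hat, x, w):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    n, f_in, f_out = len(x), len(w), len(w[0])
    ax = [[sum(a_hat[i][k] * x[k][f] for k in range(n)) for f in range(f_in)]
          for i in range(n)]
    return [[max(0.0, sum(ax[i][f] * w[f][o] for f in range(f_in)))
             for o in range(f_out)]
            for i in range(n)]

# Toy usage: 36 bins (3 octaves at 12 bins per octave), 3 adjacent frames per node.
offsets = harmonic_offsets(bins_per_octave=12, n_harmonics=4)
a_hat = normalize(build_adjacency(36, offsets))
features = [[0.1 * (i % 5), 0.2, 0.05 * i] for i in range(36)]  # made-up amplitudes
weights = [[0.5, -0.3], [0.2, 0.4], [-0.1, 0.6]]                # made-up 3x2 weights
hidden = gcn_layer(a_hat, features, weights)                    # 36 nodes x 2 features
```

In a full system, a classification head over such node features would produce the frame-wise pitch class; that step, and the salience-based fine-tuning, are left out here because the abstract gives no detail on them.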
Pages: 13
Related papers
50 in total
  • [31] Novel melody line identification algorithm for polyphonic MIDI music
    Velusamy, Sudha
    Thoshkahna, Balaji
    Ramakrishnan, K. R.
    ADVANCES IN MULTIMEDIA MODELING, PT 2, 2007, 4352 : 248 - +
  • [32] MUSICAL GENRE CLASSIFICATION USING MELODY FEATURES EXTRACTED FROM POLYPHONIC MUSIC SIGNALS
    Salamon, Justin
    Rocha, Bruno
    Gomez, Emilia
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 81 - 84
  • [33] FREQUENCY-ANCHORED DEEP NETWORKS FOR POLYPHONIC MELODY EXTRACTION
    Sharma, Aman Kumar
    Saxena, Kavya Ranjan
    Arora, Vipul
    2021 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2021, : 452 - 456
  • [34] Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals
    Durrieu, Jean-Louis
    Richard, Gael
    David, Bertrand
    Fevotte, Cedric
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (03): : 564 - 575
  • [35] On-Line Melody Extraction From Polyphonic Audio Using Harmonic Cluster Tracking
    Arora, Vipul
    Behera, Laxmidhar
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2013, 21 (03): : 520 - 530
  • [36] Transcription of Polyphonic Vocal Music with a Repetitive Melodic Structure
    Bohak, Ciril
    Marolt, Matija
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2016, 64 (09): : 664 - 672
  • [37] Enhancing Vocal Melody Extraction With Multilevel Contexts
    Wang, Xian
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 1304 - 1308
  • [38] On the use of U-Net for dominant melody estimation in polyphonic music
    Doras, Guillaume
    Esling, Philippe
    Peeters, Geoffroy
    2019 INTERNATIONAL WORKSHOP ON MULTILAYER MUSIC REPRESENTATION AND PROCESSING (MMRP 2019), 2019, : 66 - 70
  • [39] Extraction and remixing of drum tracks from polyphonic music signals
    Gillet, O
    Richard, G
    2005 WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2005, : 315 - 318
  • [40] DRUM LOOP PATTERN EXTRACTION FROM POLYPHONIC MUSIC AUDIO
    Zhu, Yongwei
    Tan, Hui Li
    Rahardja, Susanto
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 482 - 485