Temporal Pyramid Pooling Convolutional Neural Network for Cover Song Identification

Authors
Yu, Zhesong [1]
Xu, Xiaoshuo [1]
Chen, Xiaoou [1]
Yang, Deshun [1]
Affiliation
[1] Peking Univ, Inst Comp Sci & Technol, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cover song identification is an important problem in Music Information Retrieval. Most existing methods rely on hand-crafted features and sequence alignment, and further breakthroughs are hard to achieve. In this paper, Convolutional Neural Networks (CNNs) are used for representation learning for this task. We show that they can be naturally adapted to deal with key transposition in cover songs. Additionally, Temporal Pyramid Pooling is used to extract information at different scales and to transform songs of different lengths into fixed-dimensional representations. Furthermore, a training scheme is designed to enhance the robustness of our model. Extensive experiments demonstrate that, combined with these techniques, our approach is robust against the musical variations found in cover songs and outperforms state-of-the-art methods on several datasets with low time complexity.
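The idea of pooling a variable-length sequence into a fixed-dimensional vector can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `temporal_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the choice of max pooling are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def temporal_pyramid_pool(features, levels=(1, 2, 4)):
    """Pool a (channels, time) feature map into a fixed-length vector.

    Hypothetical sketch: for each pyramid level the time axis is split
    into `n_bins` nearly equal segments, and each segment is max-pooled.
    The output length is channels * sum(levels), independent of time.
    """
    channels, n_frames = features.shape
    if n_frames < max(levels):
        raise ValueError("input must have at least max(levels) frames")

    pooled = []
    for n_bins in levels:
        # Integer segment boundaries covering the whole time axis.
        edges = [(k * n_frames) // n_bins for k in range(n_bins + 1)]
        for start, end in zip(edges[:-1], edges[1:]):
            # Max over time within the segment, one value per channel.
            pooled.append(features[:, start:end].max(axis=1))
    return np.concatenate(pooled)


# Two "songs" of different lengths map to vectors of the same size.
short_song = np.random.rand(3, 10)
long_song = np.random.rand(3, 57)
vec_short = temporal_pyramid_pool(short_song)
vec_long = temporal_pyramid_pool(long_song)
```

Here both `vec_short` and `vec_long` have shape `(21,)`, that is, 3 channels times 1 + 2 + 4 segments, so songs of any length can be compared with an ordinary vector distance.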
Pages: 4846-4852
Number of pages: 7