Training audio transformers for cover song identification

Authors
Te Zeng
Francis C. M. Lau
Affiliation
[1] The University of Hong Kong, Department of Computer Science
Keywords
Cover song identification; Transformer; Music representation learning
DOI
Not available
Abstract
Over the past decades, convolutional neural networks (CNNs) have been widely adopted for audio perception tasks that aim to learn latent representations. For audio analysis, however, CNNs can be limited in how effectively they model temporal context. Following the success of transformer architectures in computer vision and audio classification, we extend this line of work to better capture long-range global context and propose the Audio Similarity Transformer (ASimT), a convolution-free, purely transformer-based architecture for learning effective representations of audio signals. We also introduce a novel loss, MAPLoss, used in tandem with a classification loss, to directly improve mean average precision. In experiments, ASimT achieves state-of-the-art performance in cover song identification on public datasets.
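For readers unfamiliar with the metric the abstract refers to, the following is a minimal sketch of how mean average precision is commonly computed for cover song retrieval. It illustrates the evaluation metric only; it is not the paper's MAPLoss, whose formulation is not given in this record. The function names, the use of cosine similarity, and the toy embeddings are all assumptions made for illustration.

```python
import math


def average_precision(relevance):
    """Average precision for one query, given a ranked list of booleans
    (True where the retrieved item is a cover of the query)."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def cosine(a, b):
    """Cosine similarity between two non-zero vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_average_precision(embeddings, labels):
    """Use each track as a query against all other tracks and average the
    per-query average precision. Queries with no other version are skipped."""
    scores = []
    for q, (q_emb, q_label) in enumerate(zip(embeddings, labels)):
        candidates = [i for i in range(len(labels)) if i != q]
        # Rank candidates from most to least similar to the query.
        candidates.sort(key=lambda i: -cosine(q_emb, embeddings[i]))
        relevance = [labels[i] == q_label for i in candidates]
        if any(relevance):
            scores.append(average_precision(relevance))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two songs (labels 0 and 1), two versions each.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
toy_map = mean_average_precision(toy_embeddings, toy_labels)
```

Because the metric depends on a ranking, it is not differentiable as written, which is presumably why a dedicated surrogate loss is needed to optimize it directly during training.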
Related papers
50 in total
  • [21] CoverHunter: Cover Song Identification with Refined Attention and Alignments
    Liu, Feng
    Tuo, Deyi
    Xu, Yinan
    Han, Xintong
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023 : 1080 - 1085
  • [22] Time complexity evaluation of cover song identification algorithms
    Ferreira, Martha Dais
    de Mello, Rodrigo Fernandes
    APPLIED ACOUSTICS, 2021, 175
  • [24] EFFECTIVE COVER SONG IDENTIFICATION BASED ON SKIPPING BIGRAMS
    Xu, Xiaoshuo
    Chen, Xiaoou
    Yang, Deshun
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 96 - 100
  • [25] Cochlear pitch class profile for cover song identification
    Chen, Ning
    Downie, J. Stephen
    Xiao, Hai-dong
    Zhu, Yu
    APPLIED ACOUSTICS, 2015, 99 : 92 - 96
  • [26] Enhanced Feature Summarizing for Effective Cover Song Identification
    Hu, Jingyi
    Chen, Ning
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (12) : 2113 - 2126
  • [27] Improved similarity fusion scheme for cover song identification
    Fan, Yanlan
    Chen, Ning
    ELECTRONICS LETTERS, 2018, 54 (24) : 1403 - 1404
  • [28] Audio frequency transformers
    Thomson, JM
    PROCEEDINGS OF THE INSTITUTE OF RADIO ENGINEERS, 1927, 15 (08) : 679 - 686
  • [29] COVER SONG IDENTIFICATION USING SONG-TO-SONG CROSS-SIMILARITY MATRIX WITH CONVOLUTIONAL NEURAL NETWORK
    Lee, Juheon
    Chang, Sungkyun
    Choe, Sang Keun
    Lee, Kyogu
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018 : 396 - 400
  • [30] DisCover: Disentangled Music Representation Learning for Cover Song Identification
    Xun, Jiahao
    Zhang, Shengyu
    Yang, Yanting
    Zhu, Jieming
    Deng, Liqun
    Zhao, Zhou
    Dong, Zhenhua
    Li, Ruiqi
    Zhang, Lichao
    Wu, Fei
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023 : 453 - 463