A Mid-Level Representation for Melody-Based Retrieval in Audio Collections

Cited by: 25
Author
Marolt, Matija [1]
Affiliation
[1] Univ Ljubljana, Fac Comp & Informat Sci, Ljubljana 1000, Slovenia
Keywords
Audio collections; information retrieval; melody; music
DOI
10.1109/TMM.2008.2007293
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Searching audio collections using high-level musical descriptors is a difficult problem because reliable methods for extracting melody, harmony, rhythm, and other such descriptors from unstructured audio signals are lacking. In this paper, we present a novel approach to melody-based retrieval in audio collections. Our approach supports both audio and symbolic queries and ranks results according to their melodic similarity to the query. We introduce a beat-synchronous melodic representation consisting of salient melodic lines, which are extracted from the analyzed audio signal. We propose the use of a 2-D shift-invariant transform to extract shift-invariant melodic fragments from the melodic representation and demonstrate how such fragments can be indexed and stored in a song database. An efficient search algorithm based on locality-sensitive hashing is used to perform retrieval according to the similarity of melodic fragments. On the cover song detection task, good results are achieved for both audio and symbolic queries, while fast retrieval performance makes the proposed system suitable for retrieval in large databases.
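
The two ideas named in the abstract, a 2-D shift-invariant transform and locality-sensitive hashing, can be illustrated with a minimal sketch. The abstract does not say which transform or hash family the paper uses, so everything below is an assumption made for illustration: the magnitude of a 2-D discrete Fourier transform stands in for the shift-invariant transform (it is invariant only to circular shifts), random-hyperplane signatures stand in for the hashing scheme, and all function names are hypothetical. A fragment is modeled as a small grid with pitch bins as rows and beats as columns.

```python
import cmath
import random


def dft2_magnitude(grid):
    """Flattened magnitude of the 2-D DFT of a small grid.

    Assumption: used here as a stand-in shift-invariant transform;
    the magnitude is unchanged by circular shifts along either axis.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = -2 * cmath.pi * (u * r / rows + v * c / cols)
                    total += grid[r][c] * cmath.exp(1j * angle)
            out.append(abs(total))
    return out


def circular_shift(grid, dr, dc):
    """Shift a grid circularly by dr rows (pitch) and dc columns (beats)."""
    rows, cols = len(grid), len(grid[0])
    return [[grid[(r - dr) % rows][(c - dc) % cols] for c in range(cols)]
            for r in range(rows)]


def make_hyperplanes(dim, n_bits, seed=0):
    """Random Gaussian hyperplanes for a sign-based LSH (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0)
                 for plane in planes)


# A toy 6-pitch-bin by 8-beat fragment and a transposed, time-shifted copy.
fragment = [[0] * 8 for _ in range(6)]
for pitch, beat in [(1, 0), (2, 1), (2, 2), (4, 3)]:
    fragment[pitch][beat] = 1
shifted = circular_shift(fragment, 2, 3)

planes = make_hyperplanes(dim=6 * 8, n_bits=16)
index = {}  # signature -> list of fragment labels sharing that bucket
index.setdefault(lsh_signature(dft2_magnitude(fragment), planes), []).append("original")
index.setdefault(lsh_signature(dft2_magnitude(shifted), planes), []).append("shifted")
```

Because the two descriptors agree up to floating-point error, both fragments are expected to land in the same bucket of `index`, which is the property that lets a hash lookup replace an exhaustive similarity search.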
Pages: 1617-1625
Page count: 9