A Mid-Level Representation for Melody-Based Retrieval in Audio Collections

Cited by: 25
Authors
Marolt, Matija [1]
Affiliations
[1] Univ Ljubljana, Fac Comp & Informat Sci, Ljubljana 1000, Slovenia
Keywords
Audio collections; information retrieval; melody; music
DOI
10.1109/TMM.2008.2007293
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Searching audio collections using high-level musical descriptors is a difficult problem due to the lack of reliable methods for extracting melody, harmony, rhythm, and other such descriptors from unstructured audio signals. In this paper, we present a novel approach to melody-based retrieval in audio collections. Our approach supports audio as well as symbolic queries and ranks results according to their melodic similarity to the query. We introduce a beat-synchronous melodic representation consisting of salient melodic lines, which are extracted from the analyzed audio signal. We propose the use of a 2-D shift-invariant transform to extract shift-invariant melodic fragments from the melodic representation and demonstrate how such fragments can be indexed and stored in a song database. An efficient search algorithm based on locality-sensitive hashing is used to perform retrieval according to the similarity of melodic fragments. On the cover song detection task, good results are achieved for audio as well as symbolic queries, while fast retrieval performance makes the proposed system suitable for retrieval in large databases.
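The two technical ingredients named in the abstract, a 2-D shift-invariant transform and locality-sensitive hashing, can be illustrated with a minimal sketch. The abstract does not say which transform or which hash family the paper uses, so the code below makes its own assumptions: the magnitude of the 2-D discrete Fourier transform stands in for the shift-invariant transform (it is invariant to circular shifts only), and random-hyperplane signatures stand in for the hashing scheme. The fragment size (12 pitch bins by 16 beats) and all function names are hypothetical.

```python
import numpy as np


def shift_invariant_descriptor(fragment: np.ndarray) -> np.ndarray:
    """Flattened magnitude of the 2-D DFT of a fragment.

    Assumption: this is a stand-in transform. A circular shift along
    either axis only changes the phase of the DFT, so the magnitude
    stays the same.
    """
    return np.abs(np.fft.fft2(fragment)).ravel()


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> np.ndarray:
    """Draw n_bits random hyperplanes (one per signature bit)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_bits, dim))


def lsh_signature(vector: np.ndarray, hyperplanes: np.ndarray) -> tuple:
    """Random-hyperplane LSH: record on which side of each plane the vector lies.

    Vectors separated by a small angle tend to receive the same signature,
    so the signature can serve as a bucket key in a hash table.
    """
    return tuple((hyperplanes @ vector >= 0).astype(int).tolist())


# Toy fragment (hypothetical size): 12 pitch bins x 16 beats.
rng = np.random.default_rng(1)
fragment = rng.random((12, 16))

# The same fragment, circularly shifted in pitch (3 bins) and time (5 beats).
shifted = np.roll(fragment, shift=(3, 5), axis=(0, 1))

d0 = shift_invariant_descriptor(fragment)
d1 = shift_invariant_descriptor(shifted)

planes = make_hyperplanes(d0.size, n_bits=16)
same_bucket = lsh_signature(d0, planes) == lsh_signature(d1, planes)
```

In this sketch the shifted and unshifted fragments yield numerically matching descriptors, so they fall into the same hash bucket. In a retrieval setting, a query fragment would be hashed the same way and compared only against database fragments that share its bucket, which is what keeps the search fast on large collections.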
Pages: 1617-1625
Page count: 9