A new fingerprint definition for effective song recognition

Cited by: 5
Authors
Serrano, Salvatore [1]
Sahbudin, Murtadha Arif Bin [1]
Chaouch, Chakib [1]
Scarpa, Marco [1]
Affiliations
[1] Univ Messina, Dept Engn, Messina, Italy
Keywords
Song recognition; Audio fingerprint; Power spectral density; Hamming distance; Binary fingerprints; ROBUST;
DOI
10.1016/j.patrec.2022.06.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Music and song recognition is of wide interest to researchers and companies because of its intrinsic challenges and its potential economic returns. Although the basic algorithms for song recognition are simple in principle, it is quite difficult to obtain an efficient and robust approach that can identify songs on the fly. This is borne out by the fact that very few companies in the world have their core business in this field, even though the potential market is very large. In this paper, we propose a new approach for generating fingerprints from excerpts of songs, which is the first step in implementing a complete song recognition algorithm. Fingerprint generation is based on Welch's method for spectral density estimation, a Mel filter bank, and an exponential adaptive threshold curve in the frequency domain that has never been used before. Although these techniques are not new in themselves, to the best of our knowledge they have not been used all together for fingerprint generation. Our main purpose is to show that the proposed fingerprint generation approach achieves very high accuracy in recognizing excerpts of songs and their position within the song, and that it is robust to typical alterations of the audio signal. Specifically, the fingerprints we generate are highly insensitive to noise and to lossy audio compression; moreover, we think that, with a small modification, the method could also generate pitch-insensitive fingerprints. Through experiments on a large database of songs, we show that the recognition accuracy obtained with our fingerprints is better than that of the landmark-based approach (already used by the well-known Shazam application). This is not a negligible result, because even a small improvement means a very large number of additional recognitions, with higher profit prospects in industrial applications.
To focus on the fingerprint structure and its generation algorithm, we do not discuss any specific search algorithm, which is left to further work, and we use only a linear search in our experiments; in this way, we think the quality of the fingerprint itself is brought out more clearly. (c) 2022 Elsevier B.V. All rights reserved.
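The abstract states that the experiments use a plain linear search, and the keywords mention binary fingerprints and Hamming distance. A minimal Python sketch of that kind of matching follows. The integer encoding of fingerprints, the 16-bit length, the names `hamming_distance` and `linear_search`, and the toy database are all assumptions made for illustration; none of them is taken from the paper.

```python
from typing import List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints stored as ints."""
    return bin(a ^ b).count("1")


# Hypothetical database entry: (song_id, position_in_song, fingerprint_bits).
Entry = Tuple[str, int, int]


def linear_search(query: int, database: List[Entry]) -> Tuple[Entry, int]:
    """Scan every entry and return the closest one with its distance."""
    if not database:
        raise ValueError("database is empty")
    best = min(database, key=lambda entry: hamming_distance(query, entry[2]))
    return best, hamming_distance(query, best[2])


# Toy 16-bit fingerprints, made up for illustration only.
database = [
    ("song_a", 0, 0b1010101010101010),
    ("song_a", 1, 0b1111000011110000),
    ("song_b", 0, 0b0000111100001111),
]
query = 0b1111000011110001  # "song_a" at position 1 with one flipped bit
match, distance = linear_search(query, database)
print(match[0], match[1], distance)  # prints: song_a 1 1
```

A linear scan costs one distance computation per stored fingerprint, which is why it suits an evaluation of fingerprint quality in isolation but would likely be replaced by an indexed search in a deployed system.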
Pages: 135-141
Page count: 7