Ensemble of deep learning, visual and acoustic features for music genre classification

Cited by: 23
Authors
Nanni, Loris [1]
Costa, Yandre M. G. [2]
Aguiar, Rafael L. [2, 3]
Silla, Carlos N., Jr. [3]
Brahnam, Sheryl [4]
Affiliations
[1] Univ Padua, Padua, Italy
[2] State Univ Maringa UEM, Maringa, Parana, Brazil
[3] Pontifical Catholic Univ Parana PUCPR, PPGIa, Curitiba, Parana, Brazil
[4] Missouri State Univ, Springfield, MO USA
Keywords
Audio classification; texture; image processing; acoustic features; ensemble of classifiers; machine learning
DOI
10.1080/09298215.2018.1438476
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this work, we present an ensemble for automated music genre classification that fuses acoustic and visual (both handcrafted and nonhandcrafted) features extracted from audio files. These features are evaluated, compared and fused in a final ensemble shown to produce better classification accuracy than other state-of-the-art approaches on the Latin Music Database, ISMIR 2004, and the GTZAN genre collection. To the best of our knowledge, this paper reports the largest test comparing the combination of different descriptors (including a wavelet convolutional scattering network, which has been tested here for the first time as an input for texture descriptors) and different matrix representations. Superior performance is obtained without ad hoc parameter optimisation; that is to say, the same ensemble of classifiers and parameter settings are used on all tested data-sets. To demonstrate generalisability, our approach is also assessed on the tasks of bird species recognition using vocalisation and whale detection data-sets. All MATLAB source code is available.
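The abstract does not state how the acoustic and visual classifiers are combined, so the following is only a minimal sketch of one common option, score-level fusion with a sum rule. The function name `sum_rule_fusion`, the equal default weights, and the toy scores are all hypothetical and are not taken from the paper or its MATLAB code.

```python
def sum_rule_fusion(score_sets, weights=None):
    """Fuse per-classifier class scores with a weighted sum rule.

    score_sets: list with one entry per classifier; each entry is a list of
                per-sample score rows, one score per class.
    weights:    optional per-classifier weights (assumption: equal weights).
    Returns (fused_scores, predicted_class_indices).
    """
    if weights is None:
        weights = [1.0] * len(score_sets)
    total_weight = sum(weights)
    n_samples = len(score_sets[0])
    n_classes = len(score_sets[0][0])

    fused = []
    for i in range(n_samples):
        row = [0.0] * n_classes
        for w, scores in zip(weights, score_sets):
            # Normalise each classifier's scores for this sample to sum to 1
            # so that no classifier dominates purely through its score scale.
            norm = sum(scores[i])
            for c in range(n_classes):
                row[c] += w * scores[i][c] / norm
        fused.append([v / total_weight for v in row])

    # The predicted genre is the class with the highest fused score.
    predicted = [max(range(n_classes), key=row.__getitem__) for row in fused]
    return fused, predicted


# Toy scores for 2 clips and 3 genres from two hypothetical classifiers,
# one trained on acoustic features and one on visual (spectrogram) features.
acoustic_scores = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
visual_scores = [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]
fused, predicted = sum_rule_fusion([acoustic_scores, visual_scores])
```

In this toy case the second clip is assigned to a different genre than the acoustic classifier alone would choose, which illustrates how a fused decision can differ from any single feature set.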
Pages: 383-397
Page count: 15
Related papers
50 in total
  • [1] Combining visual and acoustic features for music genre classification
    Nanni, Loris
    Costa, Yandre M. G.
    Lumini, Alessandra
    Kim, Moo Young
    Baek, Seung Ryul
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2016, 45 : 108 - 117
  • [2] Combining Acoustic and Multilevel Visual Features for Music Genre Classification
    Wu, Ming-Ju
    Jang, Jyh-Shing R.
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2015, 12 (01) : 1 - 17
  • [3] Music Genre Classification Based on Chroma Features and Deep Learning
    Shi, Leisi
    Li, Chen
    Tian, Lihua
    [J]. 2019 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2019 : 81 - 86
  • [4] Robustness of musical features on deep learning models for music genre classification
    Singh, Yeshwant
    Biswas, Anupam
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 199
  • [5] Music Genre Classification Based on Deep Learning
    Zhang, Wenlong
    [J]. MOBILE INFORMATION SYSTEMS, 2022, 2022
  • [6] Music genre classification and music recommendation by using deep learning
    Elbir, A.
    Aydin, N.
    [J]. ELECTRONICS LETTERS, 2020, 56 (12) : 627 - 629
  • [7] Music genre classification using visual features with feature selection
    Zottesso, Rafael H. D.
    Costa, Yandre M. G.
    Bertolini, Diego
    [J]. PROCEEDINGS OF THE 2016 35TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC), 2016
  • [8] A Music Genre Classification Method Based on Deep Learning
    He, Qi
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2022, 2022
  • [9] Music genre classification based on auditory image, spectral and acoustic features
    Xin Cai
    Hongjuan Zhang
    [J]. Multimedia Systems, 2022, 28 : 779 - 791
  • [10] Music genre classification based on auditory image, spectral and acoustic features
    Cai, Xin
    Zhang, Hongjuan
    [J]. MULTIMEDIA SYSTEMS, 2022, 28 (03) : 779 - 791