Combining Acoustic and Multilevel Visual Features for Music Genre Classification

Cited by: 15
Authors
Wu, Ming-Ju [1]
Jang, Jyh-Shing R. [2]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 30013, Taiwan
[2] Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei 10617, Taiwan
Keywords
Algorithms; Performance; Music genre classification; PATTERNS;
DOI
10.1145/2801127
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Most music genre classification approaches extract acoustic features from frames to capture timbre information, leading to the common framework of bag-of-frames analysis. However, time-frequency analysis is also vital for modeling music genres. This article proposes multilevel visual features for extracting spectrogram textures and their temporal variations. A confidence-based late fusion is proposed for combining the acoustic and visual features. The experimental results indicated that the proposed method achieved an accuracy improvement of approximately 14% and 2% in the world's largest benchmark dataset (MASD) and Unique dataset, respectively. In particular, the proposed approach won the Music Information Retrieval Evaluation eXchange (MIREX) music genre classification contests from 2011 to 2013, demonstrating the feasibility and necessity of combining acoustic and visual features for classifying music genres.
Pages: 1 / 17
Page count: 17
Related papers
50 in total
  • [1] Combining visual and acoustic features for music genre classification
    Nanni, Loris
    Costa, Yandre M. G.
    Lumini, Alessandra
    Kim, Moo Young
    Baek, Seung Ryul
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2016, 45 : 108 - 117
  • [2] Ensemble of deep learning, visual and acoustic features for music genre classification
    Nanni, Loris
    Costa, Yandre M. G.
    Aguiar, Rafael L.
    Silla, Carlos N., Jr.
    Brahnam, Sheryl
    [J]. JOURNAL OF NEW MUSIC RESEARCH, 2018, 47 (04) : 383 - 397
  • [3] Music genre classification using visual features with feature selection
    Zottesso, Rafael H. D.
    Costa, Yandre M. G.
    Bertolini, Diego
    [J]. PROCEEDINGS OF THE 2016 35TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC), 2016,
  • [4] Combining visual and acoustic features for audio classification tasks
    Nanni, L.
    Costa, Y. M. G.
    Lucio, D. R.
    Silla, C. N., Jr.
    Brahnam, S.
    [J]. PATTERN RECOGNITION LETTERS, 2017, 88 : 49 - 56
  • [5] Combining visual and acoustic features for bird species classification
    Nanni, Loris
    Costa, Yandre M. G.
    Lucio, Diego R.
    Silla, Carlos N., Jr.
    Brahnam, Sheryl
    [J]. 2016 IEEE 28TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2016), 2016, : 396 - 401
  • [6] Music genre classification based on auditory image, spectral and acoustic features
    Xin Cai
    Hongjuan Zhang
    [J]. Multimedia Systems, 2022, 28 : 779 - 791
  • [7] Music genre classification based on auditory image, spectral and acoustic features
    Cai, Xin
    Zhang, Hongjuan
    [J]. MULTIMEDIA SYSTEMS, 2022, 28 (03) : 779 - 791
  • [8] Music Mood Classification Using Visual and Acoustic Features
    Chagas Tavares, Juliano Cezar
    da Costa, Yandre Maldonado e Gomes
    [J]. 2017 XLIII LATIN AMERICAN COMPUTER CONFERENCE (CLEI), 2017,
  • [9] The Analysis and Comparison of Vital Acoustic Features in Content-Based Classification of Music Genre
    Wang, Zhe
    Xia, Jingbo
    Luo, Bin
    [J]. 2013 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND APPLICATIONS (ITA), 2013, : 404 - 408
  • [10] Optimal classification of music genres based on acoustic and visual features
    Kumaraswamy, Balachandra
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (23):