Ensemble of Transformer and Convolutional Recurrent Neural Network for Improving Discrimination Accuracy in Automatic Chord Recognition

Authors
Yamaga, Hikaru [1]
Momma, Toshifumi [1]
Kojima, Kazunori [1]
Itoh, Yoshiaki [1]
Affiliations
[1] Iwate Prefectural Univ, Takizawa, Japan
DOI
10.1109/APSIPAASC58517.2023.10317349
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Automatic chord recognition is the task of recognizing and transcribing chords from music data such as popular music. Manual chord transcription requires highly specialized knowledge and great effort. Because chords are a typical musical feature, automatic chord recognition would make them available for many purposes, such as musical notation and structural analysis. For this reason, automatic chord recognition has become a major research task in the field of music information retrieval. In recent years, deep learning models have been widely used for automatic chord recognition, and the Convolutional Recurrent Neural Network (CRNN) and the Transformer have achieved high accuracy. In this study, we focus on the differences between the feature extraction approaches of the CRNN and the Transformer, and propose an ensemble learning method that uses the two models. Additionally, we adopt an original overlap inference method that improves accuracy by compensating for the lack of temporal information. Results show that we achieved an average accuracy of 78.92% under the traditional evaluation metrics, which is 1.64% and 2.43% higher than that of the CRNN and the Transformer, respectively.
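The abstract does not say how the two models are combined or how the overlap inference is computed, so the following is only a minimal sketch under stated assumptions: the ensemble is taken to be a weighted average of frame-wise class probabilities, and overlap inference is taken to mean running a model on overlapping windows and averaging the predictions each frame receives. All function names and parameters (`average_ensemble`, `overlap_inference`, `decode_labels`, `window`, `hop`, `weight_a`) are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence

Probs = List[List[float]]  # one row of class probabilities per frame


def average_ensemble(p_a: Probs, p_b: Probs, weight_a: float = 0.5) -> Probs:
    """Weighted average of two models' frame-wise probabilities (assumed scheme)."""
    if len(p_a) != len(p_b):
        raise ValueError("both models must cover the same frames")
    w_b = 1.0 - weight_a
    return [[weight_a * a + w_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(p_a, p_b)]


def overlap_inference(frames: Sequence, model: Callable[[Sequence], Probs],
                      window: int, hop: int) -> Probs:
    """Run `model` on overlapping windows and average the rows per frame."""
    if not 0 < hop <= window:
        raise ValueError("hop must satisfy 0 < hop <= window")
    n = len(frames)
    if n == 0:
        return []
    starts = list(range(0, max(n - window, 0) + 1, hop))
    if starts[-1] + window < n:
        starts.append(n - window)  # extra window so the tail is covered
    sums: Probs = [[] for _ in range(n)]
    counts = [0] * n
    for start in starts:
        for offset, row in enumerate(model(frames[start:start + window])):
            i = start + offset
            sums[i] = list(row) if not sums[i] else [s + r for s, r in zip(sums[i], row)]
            counts[i] += 1
    return [[s / c for s in row] for row, c in zip(sums, counts)]


def decode_labels(probs: Probs) -> List[int]:
    """Pick the most probable class index for every frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]
```

In this sketch, each model would first be run through `overlap_inference` and the two resulting probability sequences passed to `average_ensemble` before decoding; whether the paper applies the two steps in this order is not stated in the abstract.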
Pages: 2299-2305
Number of pages: 7