Application of Neural Architecture Search to Instrument Recognition in Polyphonic Audio

Cited by: 3
Authors
Fricke, Leonard [1]
Vatolkin, Igor [1]
Ostermann, Fabian [1]
Affiliations
[1] TU Dortmund Univ, Dept Comp Sci, Dortmund, Germany
Keywords
Neural Architecture Search; Instrument Recognition; Music Information Retrieval; Hyperband Search; Bayesian Optimization;
DOI
10.1007/978-3-031-29956-8_8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Instrument recognition in polyphonic audio signals is a very challenging classification task. It supports related applications such as music transcription and recommendation, the organization of large music collections, or the analysis of historical trends and properties of musical styles. Recently, classification performance has been improved by integrating deep convolutional neural networks. However, in studies published to date, the network architectures and parameter settings were usually adopted from image recognition tasks and manually adjusted, without systematic optimization. In this paper, we show how two different neural architecture search strategies can be successfully applied to improve the prediction of nine instrument classes, significantly outperforming the classification performance of three fixed baseline architectures from previous works. Although model optimization requires high computing effort, the final architecture is trained only once and can then be used to predict instruments in a possibly unlimited number of musical tracks.
Pages: 117-131
Number of pages: 15