Deep Learning Approaches for Sung Vowel Classification

Authors
Carlson, Parker [1, 2]
Donnelly, Patrick J. [2]
Affiliations
[1] UC Santa Barbara, Santa Barbara, CA 93106 USA
[2] Oregon State Univ, Corvallis, OR 97331 USA
Keywords
Sung Vowels; Phoneme Classification; Raw Audio; Automatic Speech Recognition; CNN; LSTM; Transformer; VocalSet; FORMANT; FEATURES
DOI
10.1007/978-3-031-56992-0_5
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Phoneme classification is an important part of automatic speech recognition systems. However, attempting to classify phonemes during singing has been significantly less studied. In this work, we investigate sung vowel classification, a subset of the phoneme classification problem. Many prior approaches that attempt to classify spoken or sung vowels rely upon spectral feature extraction, such as formants or Mel-frequency cepstral coefficients. We explore classifying sung vowels with deep neural networks trained directly on raw audio. Using VocalSet, a singing voice dataset performed by professional singers, we compare three neural models and two spectral models for classifying five sung Italian vowels performed in a variety of vocal techniques. We find that our neural models achieved accuracies between 68.4% and 79.6%, whereas our spectral models failed to discern vowels. Of the neural models, we find that a fine-tuned transformer performed the strongest; however, a convolutional or recurrent model may provide satisfactory results in resource-limited scenarios. This result implies that neural approaches trained directly on raw audio, without extracting spectral features, are viable approaches for singing phoneme classification and deserve further exploration.
Pages: 67-83
Page count: 17