Deep Learning Approaches for Sung Vowel Classification

Authors
Carlson, Parker [1, 2]
Donnelly, Patrick J. [2]
Affiliations
[1] UC Santa Barbara, Santa Barbara, CA 93106 USA
[2] Oregon State Univ, Corvallis, OR 97331 USA
Keywords
Sung Vowels; Phoneme Classification; Raw Audio; Automatic Speech Recognition; CNN; LSTM; Transformer; VocalSet; FORMANT; FEATURES;
DOI
10.1007/978-3-031-56992-0_5
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Phoneme classification is an important part of automatic speech recognition systems. However, attempting to classify phonemes during singing has been significantly less studied. In this work, we investigate sung vowel classification, a subset of the phoneme classification problem. Many prior approaches that attempt to classify spoken or sung vowels rely upon spectral feature extraction, such as formants or Mel-frequency cepstral coefficients. We explore classifying sung vowels with deep neural networks trained directly on raw audio. Using VocalSet, a singing voice dataset performed by professional singers, we compare three neural models and two spectral models for classifying five sung Italian vowels performed in a variety of vocal techniques. We find that our neural models achieved accuracies between 68.4% and 79.6%, whereas our spectral models failed to discern vowels. Of the neural models, we find that a fine-tuned transformer performed the strongest; however, a convolutional or recurrent model may provide satisfactory results in resource-limited scenarios. This result implies that neural approaches trained directly on raw audio, without extracting spectral features, are viable approaches for singing phoneme classification and deserve further exploration.
Pages: 67-83
Page count: 17