Classical and Deep Learning Methods for Speech Command Recognition

Cited by: 2
Authors
Xie, Jie [1]
Li, Qijing [1]
Hu, Kai [1]
Zhu, Mingying [2]
Affiliations
[1] Jiangnan Univ, Sch Internet Things Engn, Wuxi, Jiangsu, Peoples R China
[2] Nanjing Univ, Sch Econ, Wuxi, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
speech command recognition; convolutional neural networks; acoustic feature;
DOI
10.1109/ICICN52636.2021.9673813
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The smart home, as an application area of speech command recognition, gives people a convenient way to communicate with various digital devices. In this study, we investigate both classical machine learning and deep learning architectures for improved speaker-independent speech command recognition. First, we extract statistical MFCC vectors to train classical machine learning models: KNN, SVM, and RF. Second, we train deep learning models using two end-to-end architectures with different inputs. Experimental results indicate that our presented method achieved the highest accuracy and F1 score, 0.846 +/- 0.148 and 0.84 +/- 0.157 respectively, on the private dataset.
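The first step in the abstract (statistical MFCC vectors feeding a classical classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which statistics are used, so the per-coefficient mean and standard deviation below are an assumption, the MFCC matrices are made-up toy values rather than features extracted from audio, and `pool_mfcc` and `knn_predict` are hypothetical helper names. Only KNN is shown; SVM and RF would consume the same fixed-length vectors.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def pool_mfcc(frames):
    """Collapse a (num_frames x num_coeffs) MFCC matrix into one fixed-length vector.

    Assumption: the "statistical" vector is the per-coefficient mean followed by
    the per-coefficient (population) standard deviation.
    """
    columns = list(zip(*frames))  # one tuple of frame values per MFCC coefficient
    return [mean(c) for c in columns] + [pstdev(c) for c in columns]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy MFCC matrices (3 frames x 2 coefficients); values are hypothetical.
on_1 = [[1.0, 2.0], [1.2, 2.1], [0.8, 1.9]]
on_2 = [[1.1, 2.2], [0.9, 2.0], [1.0, 1.8]]
off_1 = [[5.0, -1.0], [5.2, -1.1], [4.8, -0.9]]
off_2 = [[5.1, -1.2], [4.9, -1.0], [5.0, -0.8]]

train_vectors = [pool_mfcc(m) for m in (on_1, on_2, off_1, off_2)]
train_labels = ["on", "on", "off", "off"]

# An unseen utterance whose frames resemble the "off" examples.
query = pool_mfcc([[4.9, -1.0], [5.1, -0.9], [5.0, -1.1]])
prediction = knn_predict(train_vectors, train_labels, query, k=3)
print(prediction)
```

Pooling over frames is what lets utterances of different lengths share one feature dimension, which classical models such as KNN, SVM, and RF require.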
Pages: 41-45
Page count: 5