Combining Data Augmentations for CNN-Based Voice Command Recognition

Cited by: 7
Authors
Azarang, Arian [1]
Hansen, John [1]
Kehtarnavaz, Nasser [1]
Affiliations
[1] Univ Texas Dallas, Dept Elect & Comp Engn, Richardson, TX 75080 USA
Keywords
Combining data augmentation methods for voice command recognition; CNN-based voice command recognition; voice command human interaction systems; convolutional neural networks
DOI
10.1109/hsi47298.2019.8942638
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper combines two data augmentation methods, speed perturbation and room impulse response reverberation, to improve the generalization capability of convolutional neural networks used for voice command recognition. Speed perturbation generates the variations in voice commands that arise when different speakers utter the same command over shorter or longer durations. Room impulse response reverberation generates the variations caused by reflected sound paths. The combination of the two methods is evaluated on a public-domain dataset of voice commands. Experimental results, measured by word error rate, indicate that combining the two augmentation methods improves voice command recognition relative to using either method alone.
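The two augmentations described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes speed perturbation is done by resampling with linear interpolation (which also shifts pitch), and it replaces a measured room impulse response with a synthetic one made of exponentially decaying noise. The function names, the speed factor, and the `rt60` reverberation time are all hypothetical choices made for the example.

```python
import numpy as np


def speed_perturb(signal, factor):
    """Resample so the clip plays `factor` times faster (assumed method)."""
    n_out = int(round(len(signal) / factor))
    # Positions in the original signal from which each output sample is read.
    src = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(src, np.arange(len(signal)), signal)


def synthetic_rir(sample_rate, rt60=0.3, seed=0):
    """Toy impulse response: direct path plus exponentially decaying noise."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * sample_rate)
    t = np.arange(n) / sample_rate
    # exp(-6.9 * t / rt60) drops the amplitude by about 60 dB at t = rt60.
    rir = 0.1 * rng.standard_normal(n) * np.exp(-6.9 * t / rt60)
    rir[0] = 1.0  # direct (unreflected) sound path
    return rir


def reverberate(signal, rir):
    """Convolve with the impulse response, keep the input length, peak-normalize."""
    wet = np.convolve(signal, rir)[: len(signal)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


def augment(signal, sample_rate, factor, rt60):
    """Combine both augmentations: speed perturbation, then reverberation."""
    return reverberate(speed_perturb(signal, factor), synthetic_rir(sample_rate, rt60))
```

In practice one would likely generate several copies of each training clip with different factors (for example slightly below and above 1.0) and with measured impulse responses; the values used here are placeholders.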
Pages: 17-21
Page count: 5