Voice command recognition in intelligent systems using deep neural networks

Authors
Sokolov, Artem [1]
Savchenko, Andrey V. [2]
Affiliations
[1] Natl Res Univ Higher Sch Econ, Nizhnii Novgorod, Russia
[2] Natl Res Univ Higher Sch Econ, Lab Algorithms & Technol Network Anal, Nizhnii Novgorod, Russia
Keywords
Automatic speech recognition; autonomous man-machine systems; deep neural networks; voice command recognition; non-native speech
DOI
10.1109/sami.2019.8782755
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, we focus on isolated voice command recognition for autonomous man-machine and intelligent robotic systems. We propose a grammar model for a small test command set in which every state has a self-loop that returns blank symbols for noise and out-of-vocabulary words. In addition, a single arc connects the beginning and the end of the grammar in order to filter out unknown commands. As a result, the grammar is robust to distortions and to unexpected words near or inside a command. We implemented the proposed approach using finite-state transducers in the Kaldi framework and evaluated it on self-recorded noisy data with various signal-to-noise ratios. We compared the recognition accuracy and average decision-making time of our approach with those of state-of-the-art continuous speech recognition engines based on language models. The experiments showed that our approach achieves up to 60% higher accuracy than conventional offline speech recognition methods based on language models, and that it recognizes utterances 3 times faster than traditional continuous speech recognition algorithms.
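The grammar described in the abstract can be illustrated with a minimal sketch. It is not the authors' Kaldi implementation: the filler symbol `<unk>`, the example commands, the choice of a filler label on the start-to-end arc, and the helper names `build_grammar`, `to_fst_text`, and `decode` are all assumptions made for illustration. The sketch builds a small transducer with a filler self-loop on every state and a single arc from the start state to the final state, prints its arcs in the usual "source destination input output" text layout, and simulates it on a word sequence.

```python
from collections import defaultdict

EPS = "<eps>"     # blank output symbol
FILLER = "<unk>"  # hypothetical input symbol for noise and out-of-vocabulary words


def build_grammar(commands):
    """Return (arcs, final_state); each arc is (src, dst, in_sym, out_sym)."""
    start, final = 0, 1
    # Single arc connecting the beginning and the end of the grammar
    # (assumed here to consume one filler symbol and emit a blank).
    arcs = [(start, final, FILLER, EPS)]
    next_state = 2
    for command in commands:
        words = command.split()
        src = start
        for i, word in enumerate(words):
            is_last = i == len(words) - 1
            dst = final if is_last else next_state
            if not is_last:
                next_state += 1
            # The first word emits the command label; later words emit blanks.
            out = command.replace(" ", "_") if i == 0 else EPS
            arcs.append((src, dst, word, out))
            src = dst
    # Self-loop on every state: fillers are consumed and mapped to blanks.
    for state in range(next_state):
        arcs.append((state, state, FILLER, EPS))
    return arcs, final


def to_fst_text(arcs, final):
    """Render arcs as 'src dst input output' lines, then the final state."""
    lines = [f"{s} {d} {i} {o}" for s, d, i, o in arcs]
    lines.append(str(final))
    return "\n".join(lines)


def decode(arcs, final, words, vocabulary):
    """Simulate the transducer; return the set of possible output tuples."""
    by_src = defaultdict(list)
    for src, dst, in_sym, out_sym in arcs:
        by_src[src].append((dst, in_sym, out_sym))
    # Words outside the command vocabulary are treated as fillers.
    symbols = [w if w in vocabulary else FILLER for w in words]
    configs = {(0, ())}
    for sym in symbols:
        step = set()
        for state, out in configs:
            for dst, in_sym, out_sym in by_src[state]:
                if in_sym == sym:
                    step.add((dst, out if out_sym == EPS else out + (out_sym,)))
        configs = step
    return {out for state, out in configs if state == final}


arcs, final = build_grammar(["turn left", "stop"])
vocabulary = {"turn", "left", "stop"}
print(to_fst_text(arcs, final))
print(decode(arcs, final, "please turn uh left now".split(), vocabulary))
print(decode(arcs, final, "hello world".split(), vocabulary))
```

In this sketch, extra words before, inside, or after "turn left" are absorbed by the self-loops, so the first `decode` call yields the single label `turn_left`. An utterance made only of unknown words reaches the final state through the start-to-end arc and produces an empty output, which a caller could interpret as "no command".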
Pages: 113-116
Number of pages: 4