Detecting Persian speaker-independent voice commands based on LSTM and ontology in communicating with the smart home appliances

Cited by: 0
Authors
Kalkhoran, Leila Safarpoor [1 ]
Tabibian, Shima [1 ]
Homayounvala, Elaheh [2 ]
Affiliations
[1] Shahid Beheshti Univ, Cyberspace Res Inst, Tehran, Iran
[2] London Metropolitan Univ, Sch Comp & Digital Media, London, England
Keywords
Voice commands detection; Ontology; Smart home appliances; Long short-term memory; Accessibility; SYSTEM
DOI
10.1007/s10462-022-10326-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Various interfaces are used to control smart home appliances, relying on input devices such as a mouse, keyboard, microphone, or webcam. With a microphone, the interaction can take place through speech, which is a more natural way of communicating than other types of interface. Existing speech-based interfaces in the smart home domain have several problems: they limit users to a fixed set of pre-defined commands, do not support indirect commands, require a large training set, or depend on specific speakers. To address these problems, this paper proposes several approaches. An ontology is used as a knowledge base to support indirect commands and to free users from a fixed set of commands. Long Short-Term Memory (LSTM) is used to detect spoken commands more accurately. In addition, because Persian voice commands for interacting with smart home appliances were lacking, a dataset of speaker-independent Persian voice commands for communicating with a TV, media player, and lighting system was designed, recorded, and evaluated in this research. The experimental results show that the LSTM-based voice command detection system was almost 1.5% and 13% more accurate than the Hidden Markov Model-based one in the scenarios 'with' and 'without ontology', respectively. Furthermore, using ontology in the LSTM-based method improved system performance by about 40%.
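As a rough illustration of how an ontology can map an indirect command to an appliance action, the following minimal Python sketch may help. It is not the authors' method: the ontology entries, the English trigger words, and the function name `resolve_command` are all hypothetical, and the sketch assumes a recogniser (such as the LSTM described in the abstract) has already produced a text transcript.

```python
from typing import Optional, Tuple

# Hypothetical toy ontology (not from the paper): each concept lists
# trigger terms and the (appliance, action) pair that the concept implies.
ONTOLOGY = {
    "darkness": {"terms": {"dark", "dim"}, "intent": ("lighting", "turn_on")},
    "brightness": {"terms": {"bright", "glare"}, "intent": ("lighting", "turn_off")},
    "loudness": {"terms": {"loud", "noisy"}, "intent": ("tv", "volume_down")},
}


def resolve_command(transcript: str) -> Optional[Tuple[str, str]]:
    """Return the (appliance, action) implied by a transcript, or None."""
    # Lower-case the transcript and strip basic punctuation from each word.
    words = {w.strip(".,!?") for w in transcript.lower().split()}
    for concept in ONTOLOGY.values():
        # A concept matches when any of its trigger terms was spoken.
        if words & concept["terms"]:
            return concept["intent"]
    return None
```

With this toy ontology, a statement such as "It is too dark in here." names no appliance and no command, yet it resolves to the lighting system. That is the kind of indirect command a fixed command list cannot handle.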
Pages: 6039-6060
Number of pages: 22
Related papers
12 in total
  • [1] Detecting Persian speaker-independent voice commands based on LSTM and ontology in communicating with the smart home appliances
    Leila Safarpoor Kalkhoran
    Shima Tabibian
    Elaheh Homayounvala
    Artificial Intelligence Review, 2023, 56 : 6039 - 6060
  • [2] Speaker-independent recognition of isolated voice commands using auditory models
    Kolokolev, AS
    Yakhno, VP
    AUTOMATION AND REMOTE CONTROL, 1995, 56 (08) : 1176 - 1182
  • [3] Voice Conversion Based Augmentation and a Hybrid CNN-LSTM Model for Improving Speaker-Independent Keyword Recognition on Limited Datasets
    Wubet, Yeshanew Ale
    Lian, Kuang-Yow
    IEEE ACCESS, 2022, 10 : 89170 - 89180
  • [4] Speaker-independent HMM-based Voice Conversion Using Quantized Fundamental Frequency
    Nose, Takashi
    Kobayashi, Takao
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 1724 - 1727
  • [5] Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency
    Nose, Takashi
    Kobayashi, Takao
    SPEECH COMMUNICATION, 2011, 53 (07) : 973 - 985
  • [6] Speaker-independent expressive voice synthesis using learning-based hybrid network model
    Vekkot, Susmitha
    Gupta, Deepa
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (03) : 597 - 613
  • [7] Tone correctness improvement in speaker-independent average-voice-based Thai speech synthesis
    Chomphan, Suphattharachal
    Kobayashi, Takao
    SPEECH COMMUNICATION, 2009, 51 (04) : 330 - 343
  • [8] Speaker-independent expressive voice synthesis using learning-based hybrid network model
    Susmitha Vekkot
    Deepa Gupta
    International Journal of Speech Technology, 2020, 23 : 597 - 613
  • [9] A Speech Recognition Algorithm of Speaker-Independent Chinese Isolated Words Based on RNN-LSTM and Attention Mechanism
    Hao, Qiuyun
    Wang, FuQiang
    Ma, XiaoFeng
    Zhang, Peng
    2021 14TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2021), 2021,
  • [10] Improving the accuracy of Persian HMM-based Voice Command Detection System in Smart Homes Based on Ontology Method
    Kalkhoran, Leila Safarpoor
    Tabibian, Shima
    Homayounvala, Elaheh
    2020 6TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2020,