Speech recognition for command entry in multimodal interaction

Cited by: 3
Authors
Tyfa, DA [1]
Howes, M [1]
Affiliations
[1] Univ Leeds, Sch Psychol, Leeds LS2 9JT, W Yorkshire, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
speech recognition; multiple resources; multimodal interaction; command entry; hands-busy; eyes-busy; verbal interference
DOI
10.1006/ijhc.1999.0355
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Two experiments investigated the cognitive efficiency of using speech recognition in combination with the mouse and keyboard for a range of word processing tasks. The first experiment examined the potential of this multimodal combination to increase performance by engaging concurrent multiple resources. Speech and mouse responses were compared when using menu and direct (toolbar icon) commands, making for a fairer comparison than in previous research which has been biased against the mouse. Only a limited basis for concurrent resource use was found, with speech leading to poorer task performance with both command types. Task completion times were faster with direct commands for both speech and mouse responses, and direct commands were preferred. In the second experiment, participants were free to choose command type, and nearly always chose to use direct commands with both response modes. Speech performance was again worse than mouse, except for tasks which involved a large amount of hand and eye movement, or where direct speech was used but mouse commands were made via menus. In both experiments recognition errors were low, and although they had some detrimental effect on speech use, problems in combining speech and manual modes were highlighted. Potential verbal interference effects when using speech are discussed. © 2000 Academic Press.
Pages: 637-667
Number of pages: 31