Asynchronous Multimodal Text Entry using Speech and Gesture Keyboards

Cited by: 0
Authors
Kristensson, Per Ola [1 ]
Vertanen, Keith [2 ]
Affiliations
[1] Univ St Andrews, Sch Comp Sci, St Andrews KY16 9AJ, Fife, Scotland
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
mobile text entry; multimodal interfaces;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose reducing errors in text entry by combining speech and gesture keyboard input. We describe a merge model that combines recognition results in an asynchronous and flexible manner. We collected speech and gesture data from users entering both short email sentences and web search queries. By merging recognition results from both modalities, the word error rate was reduced by 53% relative for email sentences and 29% relative for web searches. For email utterances with speech errors, we investigated providing gesture keyboard corrections of only the erroneous words. Without the user explicitly indicating the incorrect words, our model was able to reduce the word error rate by 44% relative.
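The paper's merge model is not reproduced in this record, but the core idea, combining word-level hypotheses from a speech recognizer and a gesture keyboard recognizer so that each modality can correct the other's errors, can be sketched briefly. The Python sketch below is illustrative only: the function names, the toy candidate distributions, the pre-aligned word slots, and the log-linear interpolation weight are all assumptions, not the authors' actual model, which merges asynchronous, unaligned recognition results.

```python
# Minimal sketch (not the authors' actual merge model): combining
# per-word candidate distributions from a speech recognizer and a
# gesture keyboard recognizer. Each modality supplies, for every
# word slot, a dict mapping word -> probability; the merge keeps
# the word with the best interpolated log score.
import math

def merge_word_slot(speech_cands, gesture_cands, speech_weight=0.5):
    """Pick the best word for one slot by log-linear interpolation.

    speech_weight is a free parameter here; in practice it would be
    tuned on held-out data.
    """
    floor = 1e-9  # probability floor for words seen by only one modality
    best_word, best_score = None, -math.inf
    for word in set(speech_cands) | set(gesture_cands):
        score = (speech_weight * math.log(speech_cands.get(word, floor))
                 + (1.0 - speech_weight) * math.log(gesture_cands.get(word, floor)))
        if score > best_score:
            best_word, best_score = word, score
    return best_word

def merge_utterance(speech_slots, gesture_slots, speech_weight=0.5):
    """Merge two aligned sequences of per-word candidate distributions."""
    return [merge_word_slot(s, g, speech_weight)
            for s, g in zip(speech_slots, gesture_slots)]

if __name__ == "__main__":
    # Speech confuses the homophones "meat"/"meet"; the gesture
    # keyboard is confident about the spelling, so the merged
    # result recovers the intended word.
    speech = [{"lets": 0.9, "ets": 0.1}, {"meat": 0.55, "meet": 0.45}]
    gesture = [{"lets": 0.8, "lots": 0.2}, {"meet": 0.85, "meat": 0.15}]
    print(merge_utterance(speech, gesture))  # -> ['lets', 'meet']
```

The example illustrates why the combination reduces word error rate: homophone confusions that plague speech recognition are exactly the cases where a spelling-based gesture keyboard is most reliable, and vice versa.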
Pages: 588+
Page count: 2
Related Papers
50 records in total
  • [21] Text Entry in Virtual Environments using Speech and a Midair Keyboard
    Adhikary, Jiban
    Vertanen, Keith
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2021, 27 (05) : 2648 - 2658
  • [22] This Is What's Important - Using Speech and Gesture to Create Focus in Multimodal Utterance
    Freigang, Farina
    Kopp, Stefan
    INTELLIGENT VIRTUAL AGENTS, IVA 2016, 2016, 10011 : 96 - 109
  • [23] Using speech to identify gesture pen strokes in collaborative, multimodal device descriptions
    Herold, James
    Stahovich, Thomas F.
AI EDAM-ARTIFICIAL INTELLIGENCE FOR ENGINEERING DESIGN ANALYSIS AND MANUFACTURING, 2011, 25 (03) : 237 - 254
  • [25] Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech
    Miki, Madoka
    Kitaoka, Norihide
    Miyajima, Chiyomi
    Nishino, Takanori
    Takeda, Kazuya
EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014
  • [26] Exploiting speech-gesture correlation in multimodal interaction
    Chen, Fang
    Choi, Eric H. C.
    Wang, Ning
    HUMAN-COMPUTER INTERACTION, PT 3, PROCEEDINGS, 2007, 4552 : 23 - +
  • [27] Multimodal encoding of motion events in speech, gesture and cognition
    Unal, Ercenur
    Mamus, Ezgi
    Ozyurek, Asli
    LANGUAGE AND COGNITION, 2024, 16 (04) : 785 - 804
  • [28] Text Entry in Immersive Head-Mounted Display-Based Virtual Reality Using Standard Keyboards
    Grubert, Jens
    Witzani, Lukas
    Ofek, Eyal
    Pahud, Michel
    Kranz, Matthias
    Kristensson, Per Ola
    25TH 2018 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES (VR), 2018, : 159 - 166
  • [29] Emotion Classification from Speech and Text in Videos Using a Multimodal Approach
    Caschera, Maria Chiara
    Grifoni, Patrizia
    Ferri, Fernando
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2022, 6 (04)
  • [30] Speech recognition for command entry in multimodal interaction
    Tyfa, DA
    Howes, M
    INTERNATIONAL JOURNAL OF HUMAN-COMPUTER STUDIES, 2000, 52 (04) : 637 - 667