Asynchronous Multimodal Text Entry using Speech and Gesture Keyboards

Citations: 0
Authors
Kristensson, Per Ola [1 ]
Vertanen, Keith [2 ]
Affiliations
[1] Univ St Andrews, Sch Comp Sci, St Andrews KY16 9AJ, Fife, Scotland
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
mobile text entry; multimodal interfaces;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose reducing errors in text entry by combining speech and gesture keyboard input. We describe a merge model that combines recognition results in an asynchronous and flexible manner. We collected speech and gesture data of users entering both short email sentences and web search queries. By merging recognition results from both modalities, word error rate was reduced by 53% relative for email sentences and 29% relative for web searches. For email utterances with speech errors, we investigated providing gesture keyboard corrections of only the erroneous words. Without the user explicitly indicating the incorrect words, our model was able to reduce the word error rate by 44% relative.
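The abstract describes merging recognition results from the speech and gesture keyboard modalities to reduce word error rate. As a rough, hypothetical illustration only (not the authors' actual merge model, which handles asynchronous input and full recognizer output), a naive per-position weighted vote over aligned candidate words might look like this; the function name, weighting scheme, and example confidences are all invented for the sketch:

```python
# Toy illustration: each recognizer returns, per word position, a dict of
# candidate words with confidence scores (assumed pre-aligned). Merging
# picks, at each position, the candidate with the highest weighted
# combined score across both modalities.

def merge_hypotheses(speech, gesture, speech_weight=0.5):
    """Combine per-position candidate distributions from two modalities."""
    merged = []
    for s_cands, g_cands in zip(speech, gesture):
        combined = {}
        for word in set(s_cands) | set(g_cands):
            # Linear interpolation of the two modalities' confidences.
            combined[word] = (speech_weight * s_cands.get(word, 0.0)
                              + (1 - speech_weight) * g_cands.get(word, 0.0))
        merged.append(max(combined, key=combined.get))
    return merged

# Speech confuses "their"/"there"; the gesture keyboard disambiguates.
speech = [{"send": 0.9}, {"their": 0.6, "there": 0.4}]
gesture = [{"send": 0.8}, {"there": 0.7, "three": 0.3}]
print(merge_hypotheses(speech, gesture))  # ['send', 'there']
```

The key point the sketch conveys is that a word misrecognized by one modality can be recovered when the other modality assigns it higher confidence, without the user indicating which word was wrong.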
Pages: 588+
Page count: 2
Related papers
50 total
  • [31] Generating coherent spontaneous speech and gesture from text
    Alexanderson, Simon
    Szekely, Eva
    Henter, Gustav Eje
    Kucherenko, Taras
    Beskow, Jonas
    PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS (ACM IVA 2020), 2020,
  • [32] Multimodal Processing of Speech: The Relationship Between Beat Gesture Synchrony and Speech Comprehension
    Romano, Josh
    Chambers, Craig
    CANADIAN JOURNAL OF EXPERIMENTAL PSYCHOLOGY-REVUE CANADIENNE DE PSYCHOLOGIE EXPERIMENTALE, 2014, 68 (04): : 300 - 300
  • [33] Towards Utilizing Touch-sensitive Physical Keyboards for Text Entry in Virtual Reality
    Otte, Alexander
    Menzner, Tim
    Gesslein, Travis
    Gagel, Philipp
    Schneider, Daniel
    Grubert, Jens
    2019 26TH IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES (VR), 2019, : 1729 - 1732
  • [34] AirStroke: Bringing Unistroke Text Entry to Freehand Gesture Interfaces
    Ni, Tao
    Bowman, Doug A.
    North, Chris
    29TH ANNUAL CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 2011, : 2473 - 2476
  • [35] SpeeG2: A Speech- and Gesture-based Interface for Efficient Controller-free Text Entry
    Hoste, Lode
    Signer, Beat
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013, : 213 - 220
  • [36] Multimodal Speaker Identification Based on Text and Speech
    Moschonas, Panagiotis
    Kotropoulos, Constantine
    BIOMETRICS AND IDENTITY MANAGEMENT, 2008, 5372 : 100 - 109
  • [37] Speech-gesture driven multimodal interfaces for crisis management
    Sharma, R
    Yeasin, M
    Krahnstoever, N
    Rauschert, I
    Cai, G
    Brewer, I
    MacEachren, AM
    Sengupta, K
    PROCEEDINGS OF THE IEEE, 2003, 91 (09) : 1327 - 1354
  • [38] Multimodal Fusion: Gesture and Speech Input in Augmented Reality Environment
    Ismail, Ajune Wanis
    Sunar, Mohd Shahrizal
    COMPUTATIONAL INTELLIGENCE IN INFORMATION SYSTEMS, 2015, 331 : 245 - 254
  • [39] MULTIMODAL COMMUNICATION FROM MULTIMODAL THINKING - TOWARDS AN INTEGRATED MODEL OF SPEECH AND GESTURE PRODUCTION
    Kopp, Stefan
    Bergmann, Kirsten
    Wachsmuth, Ipke
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2008, 2 (01) : 115 - 136
  • [40] Alzheimer's Dementia Recognition Using Multimodal Fusion of Speech and Text Embeddings
    Pandey, Sandeep Kumar
    Shekhawat, Hanumant Singh
    Bhasin, Shalendar
    Jasuja, Ravi
    Prasanna, S. R. M.
    INTELLIGENT HUMAN COMPUTER INTERACTION, IHCI 2021, 2022, 13184 : 718 - 728