Asynchronous Multimodal Text Entry using Speech and Gesture Keyboards

Citations: 0
Authors
Kristensson, Per Ola [1 ]
Vertanen, Keith [2 ]
Affiliations
[1] Univ St Andrews, Sch Comp Sci, St Andrews KY16 9AJ, Fife, Scotland
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
mobile text entry; multimodal interfaces;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose reducing errors in text entry by combining speech and gesture keyboard input. We describe a merge model that combines recognition results in an asynchronous and flexible manner. We collected speech and gesture data of users entering both short email sentences and web search queries. By merging recognition results from both modalities, word error rate was reduced by 53% relative for email sentences and 29% relative for web searches. For email utterances with speech errors, we investigated providing gesture keyboard corrections of only the erroneous words. Without the user explicitly indicating the incorrect words, our model was able to reduce the word error rate by 44% relative.
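The abstract describes merging recognition results from the speech and gesture keyboard modalities to reduce word error rate. As a rough, hypothetical illustration only (not the authors' actual merge model, which handles asynchronous input and full recognizer output), a naive per-position weighted vote over aligned candidate words might look like this; the function name, weighting scheme, and example confidences are all invented for the sketch:

```python
# Toy illustration: each recognizer returns, per word position, a dict of
# candidate words with confidence scores (assumed pre-aligned). Merging
# picks, at each position, the candidate with the highest weighted
# combined score across both modalities.

def merge_hypotheses(speech, gesture, speech_weight=0.5):
    """Combine per-position candidate distributions from two modalities."""
    merged = []
    for s_cands, g_cands in zip(speech, gesture):
        combined = {}
        for word in set(s_cands) | set(g_cands):
            # Linear interpolation of the two modalities' confidences.
            combined[word] = (speech_weight * s_cands.get(word, 0.0)
                              + (1 - speech_weight) * g_cands.get(word, 0.0))
        merged.append(max(combined, key=combined.get))
    return merged

# Speech confuses "their"/"there"; the gesture keyboard disambiguates.
speech = [{"send": 0.9}, {"their": 0.6, "there": 0.4}]
gesture = [{"send": 0.8}, {"there": 0.7, "three": 0.3}]
print(merge_hypotheses(speech, gesture))  # ['send', 'there']
```

The key point the sketch conveys is that a word misrecognized by one modality can be recovered when the other modality assigns it higher confidence, without the user indicating which word was wrong.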
Pages: 588+
Page count: 2
Related papers
50 total
  • [31] Generating coherent spontaneous speech and gesture from text
    Alexanderson, Simon
    Szekely, Eva
    Henter, Gustav Eje
    Kucherenko, Taras
    Beskow, Jonas
    PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS (ACM IVA 2020), 2020,
  • [32] Multimodal Processing of Speech: The Relationship Between Beat Gesture Synchrony and Speech Comprehension
    Romano, Josh
    Chambers, Craig
    CANADIAN JOURNAL OF EXPERIMENTAL PSYCHOLOGY-REVUE CANADIENNE DE PSYCHOLOGIE EXPERIMENTALE, 2014, 68 (04): : 300 - 300
  • [33] Towards Utilizing Touch-sensitive Physical Keyboards for Text Entry in Virtual Reality
    Otte, Alexander
    Menzner, Tim
    Gesslein, Travis
    Gagel, Philipp
    Schneider, Daniel
    Grubert, Jens
    2019 26TH IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES (VR), 2019, : 1729 - 1732
  • [34] AirStroke: Bringing Unistroke Text Entry to Freehand Gesture Interfaces
    Ni, Tao
    Bowman, Doug A.
    North, Chris
    29TH ANNUAL CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, 2011, : 2473 - 2476
  • [35] SpeeG2: A Speech- and Gesture-based Interface for Efficient Controller-free Text Entry
    Hoste, Lode
    Signer, Beat
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013, : 213 - 220
  • [36] Multimodal Speaker Identification Based on Text and Speech
    Moschonas, Panagiotis
    Kotropoulos, Constantine
    BIOMETRICS AND IDENTITY MANAGEMENT, 2008, 5372 : 100 - 109
  • [37] Speech-gesture driven multimodal interfaces for crisis management
    Sharma, R
    Yeasin, M
    Krahnstoever, N
    Rauschert, I
    Cai, G
    Brewer, I
    MacEachren, AM
    Sengupta, K
    PROCEEDINGS OF THE IEEE, 2003, 91 (09) : 1327 - 1354
  • [38] Multimodal Fusion: Gesture and Speech Input in Augmented Reality Environment
    Ismail, Ajune Wanis
    Sunar, Mohd Shahrizal
    COMPUTATIONAL INTELLIGENCE IN INFORMATION SYSTEMS, 2015, 331 : 245 - 254
  • [39] MULTIMODAL COMMUNICATION FROM MULTIMODAL THINKING - TOWARDS AN INTEGRATED MODEL OF SPEECH AND GESTURE PRODUCTION
    Kopp, Stefan
    Bergmann, Kirsten
    Wachsmuth, Ipke
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2008, 2 (01) : 115 - 136
  • [40] Alzheimer's Dementia Recognition Using Multimodal Fusion of Speech and Text Embeddings
    Pandey, Sandeep Kumar
    Shekhawat, Hanumant Singh
    Bhasin, Shalendar
    Jasuja, Ravi
    Prasanna, S. R. M.
    INTELLIGENT HUMAN COMPUTER INTERACTION, IHCI 2021, 2022, 13184 : 718 - 728