RingGesture: A Ring-Based Mid-Air Gesture Typing System Powered by a Deep-Learning Word Prediction Framework

Cited by: 0

Authors:

Shen, Junxiao [1, 2]

Boldu, Roger [1]

Kalla, Arpit [1]

Glueck, Michael [1]

Surale, Hemant Bhaskar [1]

Karlson, Amy [1]

Affiliations:

[1] Meta, Reality Labs Research, Menlo Park, CA 94025 USA

[2] University of Bristol, Bristol, England

Source:

IEEE Transactions on Visualization and Computer Graphics | 2024 / Vol. 30 / No. 11

Keywords:

Tracking; Keyboards; Context modeling; Wrist; Predictive models; Decoding; Trajectory; Text entry; augmented reality; word prediction; language models; INPUT; PERFORMANCE;

DOI:

10.1109/TVCG.2024.3456163

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Subject classification codes:

081202; 0835

Abstract:

Text entry is a critical capability for any modern computing experience, and lightweight augmented reality (AR) glasses are no exception. Because lightweight AR glasses are designed for all-day wearability, they cannot easily include the multiple cameras needed for a wide field of view in hand tracking. This constraint underscores the need for an additional input device. We propose a system to address this gap: RingGesture, a ring-based mid-air gesture typing technique that uses electrodes to mark the start and end of gesture trajectories and inertial measurement unit (IMU) sensors for hand tracking. This method offers an intuitive experience similar to the raycast-based mid-air gesture typing found in VR headsets, translating hand movements directly into cursor navigation. To improve both accuracy and input speed, we propose a novel deep-learning word prediction framework, Score Fusion, comprising three key components: a) a word-gesture decoding model, b) a spatial spelling correction model, and c) a lightweight contextual language model. The framework fuses the scores from the three models to predict the most likely words with higher precision. We conduct comparative and longitudinal studies that demonstrate two key findings. First, RingGesture is effective overall, achieving an average text entry speed of 27.3 words per minute (WPM) and a peak performance of 47.9 WPM. Second, the Score Fusion framework outperforms a conventional word prediction framework, Naive Correction, with a 28.2% improvement in uncorrected Character Error Rate, leading to a 55.2% improvement in text entry speed for RingGesture. Additionally, RingGesture received a System Usability Scale score of 83, signifying excellent usability.
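The abstract states that Score Fusion combines the scores of three models but does not say how. The sketch below illustrates one common way such a combination could work, a weighted sum of log-probabilities per candidate word. The function name `fuse_scores`, the weights, the probability floor, and all candidate scores are hypothetical and are not taken from the paper.

```python
import math

def fuse_scores(gesture, spelling, language, weights=(1.0, 1.0, 1.0), floor=1e-9):
    """Rank candidate words by a weighted sum of log-probabilities.

    Assumption: each argument maps a candidate word to a probability-like
    score from one model. The paper's actual fusion rule may differ.
    """
    w_g, w_s, w_l = weights
    candidates = set(gesture) | set(spelling) | set(language)
    fused = {}
    for word in candidates:
        # Words missing from a model receive a small floor probability.
        fused[word] = (
            w_g * math.log(gesture.get(word, floor))
            + w_s * math.log(spelling.get(word, floor))
            + w_l * math.log(language.get(word, floor))
        )
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical scores for three candidate words.
gesture = {"hello": 0.45, "help": 0.40, "hells": 0.15}   # word-gesture decoder
spelling = {"hello": 0.50, "help": 0.30, "hells": 0.20}  # spatial spelling correction
language = {"hello": 0.30, "help": 0.60, "hells": 0.10}  # contextual language model

ranking = fuse_scores(gesture, spelling, language)
print(ranking)
```

In this toy example the gesture decoder alone would rank "hello" first, while the fused ranking prefers "help" because the contextual language model favours it strongly. Setting the third weight to 0 removes the language model's influence and restores "hello" as the top candidate.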

Pages: 7441-7451

Page count: 11
