RingGesture: A Ring-Based Mid-Air Gesture Typing System Powered by a Deep-Learning Word Prediction Framework

Cited by: 0
Authors
Shen, Junxiao [1 ,2 ]
Boldu, Roger [1 ]
Kalla, Arpit [1 ]
Glueck, Michael [1 ]
Surale, Hemant Bhaskar [1 ]
Karlson, Amy [1 ]
Affiliations
[1] Meta, Reality Labs Research, Menlo Park, CA 94025 USA
[2] Univ Bristol, Bristol, England
Keywords
Tracking; Keyboards; Context modeling; Wrist; Predictive models; Decoding; Trajectory; Text entry; augmented reality; word prediction; language models; INPUT; PERFORMANCE;
DOI
10.1109/TVCG.2024.3456163
Chinese Library Classification
TP31 [Computer Software];
Discipline classification codes
081202 ; 0835 ;
Abstract
Text entry is a critical capability for any modern computing experience, and lightweight augmented reality (AR) glasses are no exception. Because they are designed for all-day wearability, lightweight AR glasses cannot easily accommodate the multiple cameras needed for hand tracking over an extensive field of view. This constraint underscores the need for an additional input device. We propose a system to address this gap: RingGesture, a ring-based mid-air gesture typing technique that uses electrodes to mark the start and end of gesture trajectories and inertial measurement unit (IMU) sensors for hand tracking. This method offers an intuitive experience similar to the raycast-based mid-air gesture typing found in VR headsets, allowing hand movements to be translated seamlessly into cursor navigation. To enhance both accuracy and input speed, we propose a novel deep-learning word prediction framework, Score Fusion, comprising three key components: a) a word-gesture decoding model, b) a spatial spelling correction model, and c) a lightweight contextual language model. The framework fuses the scores from the three models to predict the most likely words with higher precision. We conduct comparative and longitudinal studies that demonstrate two key findings. First, RingGesture is effective overall, achieving an average text entry speed of 27.3 words per minute (WPM) and a peak performance of 47.9 WPM. Second, the Score Fusion framework outperforms a conventional word prediction framework, Naive Correction, with a 28.2% improvement in uncorrected Character Error Rate, leading to a 55.2% improvement in text entry speed for RingGesture. Additionally, RingGesture received a System Usability Scale (SUS) score of 83, signifying excellent usability.
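The abstract says only that Score Fusion fuses the scores of the three models; it does not give the fusion rule. As a rough illustration of the general idea, the sketch below assumes each model returns a probability per candidate word and combines them with a weighted sum of log-probabilities. The function name `fuse_scores`, the weights, the probability floor, and the example numbers are all hypothetical and are not taken from the paper.

```python
import math


def fuse_scores(gesture, spelling, language, weights=(1.0, 1.0, 1.0), floor=1e-9):
    """Rank candidate words by a weighted sum of log-probabilities.

    ASSUMPTION: log-linear fusion is used here only for illustration;
    the paper's actual fusion rule is not described in the abstract.
    Each argument maps a candidate word to a probability in (0, 1].
    """
    w_g, w_s, w_l = weights
    candidates = set(gesture) | set(spelling) | set(language)
    fused = {}
    for word in candidates:
        # A word missing from one model receives a small floor probability
        # so that it is penalized but not discarded outright.
        fused[word] = (
            w_g * math.log(gesture.get(word, floor))
            + w_s * math.log(spelling.get(word, floor))
            + w_l * math.log(language.get(word, floor))
        )
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)


# Made-up scores: the gesture decoder alone slightly prefers "jello",
# but the contextual language model strongly prefers "hello".
gesture_scores = {"jello": 0.5, "hello": 0.4, "hells": 0.1}
spelling_scores = {"hello": 0.5, "jello": 0.4, "hells": 0.1}
language_scores = {"hello": 0.7, "hells": 0.25, "jello": 0.05}

ranked = fuse_scores(gesture_scores, spelling_scores, language_scores)
print(ranked)
```

With these invented numbers, a word that no single model rules out but that the language model favors rises to the top of the ranking, which is the kind of behavior a score-fusion approach is meant to achieve.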
Pages: 7441-7451
Page count: 11