Towards Better Communication: Refining Hand Pose Estimation in Low-Resolution Sign Language Videos

Cited by: 0
Authors
Tasyurek, Umeyye Meryem [1]
Kiziltepe, Tugce [1]
Keles, Hacer Yalim [1]
Affiliations
[1] Hacettepe Univ, Dept Comp Engn, Ankara, Turkiye
DOI
10.1109/FG59268.2024.10582003
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we present a novel methodology that enhances hand keypoint extraction in low-resolution sign language datasets, a challenge that has been largely unexplored in sign language research. Existing pose extraction models such as OpenPose and MediaPipe frequently fail to detect hand keypoints accurately in low-resolution footage, and our method addresses this limitation. It adapts the U-Net and Attention U-Net architectures to improve the resolution of sign language videos while reducing undetected hand presence (UHP) in low-resolution footage. The key innovation is a progressive training procedure that focuses on hand movements, using datasets from the SRF DSGS and ShowTV Main News domains. In comprehensive experiments and cross-dataset evaluations, the UHP ratio drops significantly, most notably for the Attention U-Net model trained with our proposed loss function, which is tailored to improve hand keypoint detection. In our benchmark tests on low-resolution TV news broadcasts, our fine-tuned models, particularly the BWA-UNet, showed marked improvements in hand keypoint accuracy compared to standard upsampling methods. These results underscore the effectiveness of our approach in practical, real-world scenarios and its potential to substantially improve hand keypoint detection in sign language videos.
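The abstract reports a UHP ratio but does not define it. As a rough illustration only, the sketch below assumes the ratio is the fraction of frames, all known to contain a hand, in which the pose estimator returns no hand keypoints. The function name `uhp_ratio` and the per-frame flags are hypothetical and are not results from the paper.

```python
from typing import Optional, Sequence


def uhp_ratio(hand_detected: Sequence[bool]) -> Optional[float]:
    """Fraction of frames with no detected hand keypoints.

    Assumed definition (not taken from the paper): every frame is
    expected to contain a hand, and each flag records whether the pose
    estimator returned hand keypoints for that frame.
    """
    if not hand_detected:
        return None  # undefined for an empty clip
    missed = sum(1 for detected in hand_detected if not detected)
    return missed / len(hand_detected)


# Hypothetical detection flags for the same four frames, once after
# standard upsampling and once after a learned enhancement model.
upsampled_flags = [True, False, False, True]
enhanced_flags = [True, True, False, True]

print(uhp_ratio(upsampled_flags))  # 0.5
print(uhp_ratio(enhanced_flags))   # 0.25
```

Under this assumed definition, a lower value means fewer frames in which the hand goes undetected, which is the direction of improvement the abstract describes.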
Pages: 10
Related papers
23 in total
  • [1] Low-resolution human pose estimation
    Wang, Chen
    Zhang, Feng
    Zhu, Xiatian
    Ge, Shuzhi Sam
    [J]. PATTERN RECOGNITION, 2022, 126
  • [2] Automatic and Efficient Human Pose Estimation for Sign Language Videos
    Charles, James
    Pfister, Tomas
    Everingham, Mark
    Zisserman, Andrew
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2014, 110 (01) : 70 - 90
  • [3] Automatic and Efficient Human Pose Estimation for Sign Language Videos
    James Charles
    Tomas Pfister
    Mark Everingham
    Andrew Zisserman
    [J]. International Journal of Computer Vision, 2014, 110 : 70 - 90
  • [4] Hand pose estimation for American Sign Language recognition
    Isaacs, J
    Foo, S
    [J]. PROCEEDINGS OF THE THIRTY-SIXTH SOUTHEASTERN SYMPOSIUM ON SYSTEM THEORY, 2004 : 132 - 136
  • [5] Stable 3D Human Pose Estimation in Low-Resolution Videos with a Few Views
    Nakatsuka, Chihiro
    Komorita, Satoshi
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022
  • [6] Optimized wavelet hand pose estimation for American sign language recognition
    Isaacs, J
    Foo, S
    [J]. CEC2004: PROCEEDINGS OF THE 2004 CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1 AND 2, 2004 : 797 - 802
  • [7] Free-Head Pose Estimation under Low-Resolution Scenarios
    Liu, Jingjing
    Wang, Zhiyong
    Qin, Haibo
    Xu, Kai
    Ji, Bin
    Liu, Honghai
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020 : 2277 - 2283
  • [8] Joint Super-Resolution and Head Pose Estimation for Extreme Low-Resolution Faces
    Malakshan, Sahar Rahimi
    Saadabadi, Mohammad Saeed Ebrahimi
    Mostofa, Moktari
    Soleymani, Sobhan
    Nasrabadi, Nasser M.
    [J]. IEEE ACCESS, 2023, 11 : 11238 - 11253
  • [9] Multi-resolution Fusion Network for Human Pose Estimation in Low-resolution Images
    Kim, Boeun
    Choo, YeonSeung
    Jeong, Hea In
    Kim, Chung-Il
    Shin, Saim
    Kim, Jungho
    [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2022, 16 (07) : 2328 - 2344
  • [10] 3D Human Pose, Shape and Texture From Low-Resolution Images and Videos
    Xu, Xiangyu
    Chen, Hao
    Moreno-Noguer, Francesc
    Jeni, Laszlo A.
    De la Torre, Fernando
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (09) : 4490 - 4504