PerceptGuide: A Perception Driven Assistive Mobility Aid Based on Self-Attention and Multi-Scale Feature Fusion

Cited by: 1
Authors:
Madake, Jyoti [1 ]
Bhatlawande, Shripad [1 ]
Solanke, Anjali [2 ]
Shilaskar, Swati [1 ]
Affiliations:
[1] Vishwakarma Inst Technol, Pune 411037, India
[2] Marathwada Mitra Mandals Coll Engn, Pune 411052, India
Keywords:
Blind assistive; mobility aid; scene understanding; wearable aid; Resnet-50; feature fusion; self-attention; multilayer GRU; SYSTEM; BLIND;
DOI:
10.1109/ACCESS.2023.3314702
Chinese Library Classification:
TP [Automation Technology; Computer Technology]
Subject Classification Code:
0812
Abstract
The paper introduces PerceptGuide, a novel wearable aid that helps visually impaired individuals perceive the scene around them. It is designed as a lightweight, wearable chest-rig bag that incorporates a monocular camera, ultrasonic sensors, vibration motors, and a mono earphone, powered by an embedded Nvidia Jetson development board. The system provides directional obstacle alerts through the vibration motors, allowing users to avoid obstacles in their path. A user-friendly push button lets the user request information about the scene in front of them. Scene details are conveyed through a novel scene-understanding approach that combines multi-scale feature fusion, self-attention models, and a multilayer GRU (Gated Recurrent Unit) architecture on a ResNet50 backbone. The proposed system generates coherent, descriptive captions by capturing image features at different scales, enhancing the quality and contextual understanding of the scene details. Self-attention in both the encoder (ResNet50 + feature-fusion model) and the decoder (multilayer GRU) effectively captures long-range dependencies and attends to relevant image regions. Quantitative evaluations on the MSCOCO and Flickr8k datasets show the effectiveness of the model, with improved scores of BLEU 67.7, ROUGE-L 47.6, METEOR 22.7, and CIDEr 67.4. The PerceptGuide system exhibits strong real-time performance, generating audible captions in just 1.5 to 2 seconds; this rapid response significantly aids visually impaired individuals in understanding the scenes around them. The qualitative evaluation of the aid emphasizes its real-time performance, demonstrating the generation of context-aware, semantically meaningful captions. This validates its potential as a wearable assistive aid for visually impaired people, with the added advantages of low power consumption, compactness, and a lightweight design.
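The abstract names the architectural ingredients but not their exact wiring. The PyTorch sketch below shows one plausible arrangement of those ingredients (ResNet50 backbone, multi-scale feature fusion, self-attention in the encoder, multilayer GRU decoder); all layer sizes, the token-concatenation fusion, the cross-attention coupling, and the class and variable names (MultiScaleEncoder, GRUCaptionDecoder, d_model, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of an encoder-decoder captioner in the
# spirit of PerceptGuide: ResNet50 features at two scales are fused, refined by
# self-attention, and decoded by a multilayer GRU. All hyperparameters are assumptions.
import torch
import torch.nn as nn
import torchvision


class MultiScaleEncoder(nn.Module):
    """ResNet50 backbone; two feature scales fused and refined by self-attention."""

    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)  # weights omitted to keep the sketch self-contained
        self.stem = nn.Sequential(
            backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
            backbone.layer1, backbone.layer2,
        )
        self.layer3 = backbone.layer3                      # 1024-channel, higher-resolution features
        self.layer4 = backbone.layer4                      # 2048-channel, lower-resolution features
        self.proj3 = nn.Conv2d(1024, d_model, kernel_size=1)
        self.proj4 = nn.Conv2d(2048, d_model, kernel_size=1)
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, images):                             # images: (B, 3, H, W)
        c2 = self.stem(images)
        c3 = self.layer3(c2)
        c4 = self.layer4(c3)
        t3 = self.proj3(c3).flatten(2).transpose(1, 2)     # (B, N3, d_model)
        t4 = self.proj4(c4).flatten(2).transpose(1, 2)     # (B, N4, d_model)
        tokens = torch.cat([t3, t4], dim=1)                # multi-scale fusion by token concatenation
        fused, _ = self.self_attn(tokens, tokens, tokens)  # self-attention over fused regions
        return fused                                       # (B, N3 + N4, d_model)


class GRUCaptionDecoder(nn.Module):
    """Multilayer GRU decoder that attends to the fused visual tokens."""

    def __init__(self, vocab_size, d_model=512, n_heads=8, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.gru = nn.GRU(d_model, d_model, num_layers=num_layers, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(2 * d_model, vocab_size)

    def forward(self, memory, captions):                   # memory: (B, N, d), captions: (B, T) token ids
        emb = self.embed(captions)                         # (B, T, d)
        h0 = memory.mean(dim=1).unsqueeze(0).repeat(self.gru.num_layers, 1, 1)
        dec, _ = self.gru(emb, h0.contiguous())            # (B, T, d)
        ctx, _ = self.cross_attn(dec, memory, memory)      # attend to relevant image regions
        return self.out(torch.cat([dec, ctx], dim=-1))     # (B, T, vocab_size) logits


if __name__ == "__main__":
    encoder, decoder = MultiScaleEncoder(), GRUCaptionDecoder(vocab_size=10000)
    images = torch.randn(2, 3, 224, 224)
    captions = torch.randint(0, 10000, (2, 12))            # teacher-forcing inputs
    logits = decoder(encoder(images), captions)
    print(logits.shape)                                    # torch.Size([2, 12, 10000])
```

Concatenating the flattened tokens from the two scales lets a single self-attention layer relate coarse and fine image regions before decoding; the paper's actual fusion strategy and attention placement may differ from this sketch.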
Pages: 101167-101182
Page count: 16
Related Papers (50 in total)
  • [1] Detection of Rice Pests Based on Self-Attention Mechanism and Multi-Scale Feature Fusion
    Hu, Yuqi
    Deng, Xiaoling
    Lan, Yubin
    Chen, Xin
    Long, Yongbing
    Liu, Cunjia
    [J]. INSECTS, 2023, 14 (03)
  • [2] Multi-scale pedestrian detection based on self-attention and adaptively spatial feature fusion
    Wang, Minjun
    Chen, Houjin
    Li, Yanfeng
    You, Yuhao
    Zhu, Jinlei
    [J]. IET INTELLIGENT TRANSPORT SYSTEMS, 2021, 15 (06) : 837 - 849
  • [3] Tea Disease Detection Method with Multi-scale Self-attention Feature Fusion
    Sun, Yange
    Wu, Fei
    Yao, Jianfeng
    Zhou, Qiying
    Shen, Jianbo
    [J]. Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2023, 54 (12): 309 - 315
  • [4] Clothing Parsing Based on Multi-Scale Fusion and Improved Self-Attention Mechanism
    陈诺
    王绍宇
    陆然
    李文萱
    覃志东
    石秀金
    [J]. Journal of Donghua University(English Edition), 2023, 40 (06) : 661 - 666
  • [5] Multi-scale quaternion CNN and BiGRU with cross self-attention feature fusion for fault diagnosis of bearing
    Liu, Huanbai
    Zhang, Fanlong
    Tan, Yin
    Huang, Lian
    Li, Yan
    Huang, Guoheng
    Luo, Shenghong
    Zeng, An
    [J]. MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (08)
  • [6] Multi-Scale Self-Attention for Text Classification
    Guo, Qipeng
    Qiu, Xipeng
    Liu, Pengfei
    Xue, Xiangyang
    Zhang, Zheng
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7847 - 7854
  • [7] MSSA-Net: A novel multi-scale feature fusion and global self-attention network for lesion segmentation
    Huang, Zhaohong
    Zhang, Xiangchen
    Zhang, Guowei
    Cai, Guorong
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (21):
  • [8] Prediction of Large-Scale Regional Evapotranspiration Based on Multi-Scale Feature Extraction and Multi-Headed Self-Attention
    Zheng, Xin
    Zhang, Sha
    Zhang, Jiahua
    Yang, Shanshan
    Huang, Jiaojiao
    Meng, Xianye
    Bai, Yun
    [J]. REMOTE SENSING, 2024, 16 (07)
  • [9] Multi-scale self-attention mixup for graph classification *
    Kong, Youyong
    Li, Jiaxing
    Zhang, Ke
    Wu, Jiasong
    [J]. PATTERN RECOGNITION LETTERS, 2023, 168 : 100 - 106
  • [10] Self-Attention-based Multi-Scale Feature Fusion Network for Road Ponding Segmentation
    Yang, Shangyu
    Zhang, Ronghui
    Sun, Wencai
    Chen, Shengru
    Ye, Cong
    Wu, Hao
    Li, Mengran
    [J]. 2024 2ND ASIA CONFERENCE ON COMPUTER VISION, IMAGE PROCESSING AND PATTERN RECOGNITION, CVIPPR 2024, 2024,