Fine-grained activity classification in assembly based on multi-visual modalities

Cited: 7
Authors
Chen, Haodong [1 ]
Zendehdel, Niloofar [1 ]
Leu, Ming C. [1 ]
Yin, Zhaozheng [2 ,3 ]
Affiliations
[1] Missouri Univ Sci & Technol, Dept Mech & Aerosp Engn, Rolla, MO 65409 USA
[2] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY USA
[3] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Funding
U.S. National Science Foundation (NSF);
Keywords
Fine-grained activity; Activity classification; Assembly; Multi-visual modality; RECOGNITION; LSTM;
DOI
10.1007/s10845-023-02152-x
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Assembly activity recognition and prediction help to improve productivity, quality control, and safety measures in smart factories. This study aims to sense, recognize, and predict a worker's continuous fine-grained assembly activities on a manufacturing platform. We propose a two-stage network for fine-grained worker activity classification that leverages scene-level and temporal-level activity features. The first stage is a feature awareness block that extracts scene-level features from multi-visual modalities, including red-green-blue (RGB) and hand skeleton frames. We use transfer learning in this stage and compare three different pre-trained feature extraction models. The extracted features are then passed to the second stage, which learns the temporal-level features of activities and consists of Recurrent Neural Network (RNN) layers and a final classifier. We compare the performance of two RNN variants in the second stage: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). A partial video observation method is used to predict fine-grained activities before they are complete. In experiments on trimmed activity videos, our model achieves an accuracy of > 99% on our dataset and > 98% on the public UCF 101 dataset, outperforming state-of-the-art models. The prediction model achieves an accuracy of > 97% in predicting activity labels using only the first 50% of the activity video. In experiments on an untrimmed video with continuous assembly activities, we combine our recognition and prediction models and achieve an accuracy of > 91% in real time, surpassing state-of-the-art models for the recognition of continuous assembly activities.
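The following is a minimal PyTorch sketch of the two-stage idea summarized in the abstract: a frozen, pre-trained backbone extracts per-frame (scene-level) features, and a GRU or LSTM with a linear classifier learns the temporal-level features. The ResNet-18 backbone, feature size, and layer widths are illustrative assumptions rather than the authors' exact configuration, and only the RGB stream is shown (the hand skeleton modality and the three compared backbones are omitted).

import torch
import torch.nn as nn
from torchvision import models

class TwoStageActivityClassifier(nn.Module):
    def __init__(self, num_classes, hidden_size=256, rnn_type="gru"):
        super().__init__()
        # Stage 1: pre-trained backbone used as a frozen per-frame feature
        # extractor (transfer learning); its final FC layer is removed.
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        feat_dim = backbone.fc.in_features          # 512 for ResNet-18
        backbone.fc = nn.Identity()
        for p in backbone.parameters():
            p.requires_grad = False
        self.backbone = backbone

        # Stage 2: recurrent layers (GRU or LSTM) over the frame-feature
        # sequence, followed by a linear classifier on the last time step.
        rnn_cls = nn.GRU if rnn_type == "gru" else nn.LSTM
        self.rnn = rnn_cls(feat_dim, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t, c, h, w = clips.shape
        feats = self.backbone(clips.reshape(b * t, c, h, w))  # (b*t, feat_dim)
        feats = feats.reshape(b, t, -1)                       # (b, t, feat_dim)
        out, _ = self.rnn(feats)                              # (b, t, hidden)
        return self.classifier(out[:, -1])                    # (b, num_classes)

# Example: classify a batch of 2 clips, 16 RGB frames each, into 10 activities.
model = TwoStageActivityClassifier(num_classes=10)
logits = model(torch.randn(2, 16, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])

For the prediction setting described in the abstract, the same model can be fed only the first 50% of a clip's frames, since the classifier reads the recurrent layer's last hidden state regardless of sequence length.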
Pages: 2215-2233
Page count: 19