Fine-grained activity classification in assembly based on multi-visual modalities

Cited by: 7
Authors
Chen, Haodong [1]
Zendehdel, Niloofar [1]
Leu, Ming C. [1]
Yin, Zhaozheng [2, 3]
Affiliations
[1] Missouri Univ Sci & Technol, Dept Mech & Aerosp Engn, Rolla, MO 65409 USA
[2] SUNY Stony Brook, Dept Biomed Informat, Stony Brook, NY USA
[3] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Funding
National Science Foundation (USA)
Keywords
Fine-grained activity; Activity classification; Assembly; Multi-visual modality; Recognition; LSTM
DOI
10.1007/s10845-023-02152-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Assembly activity recognition and prediction help improve productivity, quality control, and safety in smart factories. This study aims to sense, recognize, and predict a worker's continuous fine-grained assembly activities on a manufacturing platform. We propose a two-stage network for classifying workers' fine-grained activities that leverages scene-level and temporal-level activity features. The first stage is a feature awareness block that extracts scene-level features from multiple visual modalities, namely red-green-blue (RGB) frames and hand skeleton frames. We use transfer learning in the first stage and compare three pre-trained feature extraction models. The features from the first stage are then passed to the second stage, which learns the temporal-level features of activities. The second stage consists of Recurrent Neural Network (RNN) layers and a final classifier. We compare two RNN variants in the second stage: the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU). The partial video observation method is used to predict fine-grained activities. In experiments on trimmed activity videos, our model achieves an accuracy of > 99% on our dataset and > 98% on the public dataset UCF101, outperforming state-of-the-art models. The prediction model achieves an accuracy of > 97% in predicting activity labels using the first 50% of each activity video. In experiments on an untrimmed video with continuous assembly activities, we combine our recognition and prediction models and achieve an accuracy of > 91% in real time, surpassing state-of-the-art models for the recognition of continuous assembly activities.
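The partial video observation idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "partial observation" means classifying from only the leading fraction of a clip's frames; the function name `observe_partial`, the rounding rule, and the integer stand-ins for frames are hypothetical and are not taken from the paper.

```python
from math import ceil
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def observe_partial(frames: Sequence[Frame], ratio: float) -> List[Frame]:
    """Return the leading `ratio` fraction of a clip's frames.

    Assumption: the observed part always starts at the onset of the
    activity and keeps at least one frame of a non-empty clip.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in the interval (0, 1]")
    if not frames:
        return []
    # Round up so that short clips still contribute at least one frame.
    keep = max(1, ceil(len(frames) * ratio))
    return list(frames[:keep])


# Integers stand in for decoded video frames in this illustration.
clip = list(range(10))
first_half = observe_partial(clip, 0.5)
```

In a pipeline like the one described, only the frames returned here would be fed to the feature extractor and the recurrent layers, so the classifier has to commit to a label before the activity has finished.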
Pages: 2215-2233
Page count: 19