Pedestrian Attribute Recognition Based on Multimodal Transformer

Cited: 0
Authors
Liu, Dan [1 ]
Song, Wei [1 ,2 ,3 ]
Zhao, Xiaobing [1 ,3 ]
Affiliations
[1] Minzu Univ China, Sch Informat Engn, Beijing 100081, Peoples R China
[2] Minzu Univ China, Key Lab Ethn Language Intelligent Anal & Secur Go, MOE, Beijing 100081, Peoples R China
[3] Minzu Univ China, Natl Language Resource Monitoring & Res Ctr Minor, Beijing 100081, Peoples R China
Keywords
Pedestrian Attribute Recognition; Multimodal Learning; Transformer;
DOI
10.1007/978-981-99-8429-9_34
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Pedestrian attribute recognition (PAR) is susceptible to variable shooting angles, lighting, and occlusions, so improving recognition accuracy across complex real-world scenarios is a central task. In this paper, an Image-Text Multimodal Transformer learns the intra-modal and inter-modal correlations between pedestrian images and attribute labels. The applicability of six different multimodal fusion frameworks to attribute recognition is explored, and the impact of each framework's fused-feature division method on recognition accuracy is compared and analyzed. Comparative experiments verify the robustness and efficiency of the Early Concatenate framework, which achieves multiple best metric scores on two major public PAR datasets, PA100k and RAP. This paper not only proposes a new high-accuracy Transformer-based multimodal network but also offers feasible ideas and directions for further PAR research; the comparative discussion of the various multimodal frameworks likewise provides insights transferable to other multimodal tasks.
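The Early Concatenate idea highlighted in the abstract can be illustrated with a minimal PyTorch sketch: image patch tokens and learnable attribute-label tokens are concatenated before a shared Transformer encoder, so self-attention models intra-modal and inter-modal correlations jointly. The module name EarlyConcatFusion, all dimensions, and the read-out from the attribute-token positions are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of "early concatenate" image-text fusion for PAR.
# Assumes a ViT-style patch embedding supplies img_tokens; attribute labels
# enter as learnable text-side token embeddings. Names and sizes are
# illustrative, not taken from the paper.
import torch
import torch.nn as nn

class EarlyConcatFusion(nn.Module):
    def __init__(self, num_attrs=26, dim=256, patches=196, depth=4, heads=8):
        super().__init__()
        # One learnable token per attribute label (text modality).
        self.attr_tokens = nn.Parameter(torch.randn(1, num_attrs, dim))
        # Joint positional embedding over the concatenated sequence.
        self.pos = nn.Parameter(torch.randn(1, patches + num_attrs, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, 1)  # one logit per attribute token

    def forward(self, img_tokens):
        # img_tokens: (B, patches, dim), e.g. from a ViT patch embedding.
        b = img_tokens.size(0)
        # Concatenate BEFORE the encoder so attention spans both modalities.
        x = torch.cat([img_tokens, self.attr_tokens.expand(b, -1, -1)], dim=1)
        x = self.encoder(x + self.pos)
        # Read predictions off the fused attribute-token positions.
        attr_feats = x[:, -self.attr_tokens.size(1):, :]
        return self.head(attr_feats).squeeze(-1)  # (B, num_attrs) logits

if __name__ == "__main__":
    model = EarlyConcatFusion()
    logits = model(torch.randn(2, 196, 256))
    print(logits.shape)  # torch.Size([2, 26])

Because the two modalities share one encoder from the first layer onward, every attribute token can attend to every image patch (and vice versa), which is the property the abstract credits for the framework's robustness; later-fusion variants would instead encode each modality separately and merge features afterward.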
Pages: 422-433
Page count: 12
Related Papers
50 records in total
  • [41] STDP-Net: Improved Pedestrian Attribute Recognition Using Swin Transformer and Semantic Self-Attention
    Lee, Geonu
    Cho, Jungchan
    IEEE ACCESS, 2022, 10 : 82656 - 82667
  • [42] Explicit State Representation Guided Video-based Pedestrian Attribute Recognition
    Lu, Wei-Qing
    Hu, Hai-Miao
    Yu, Jinzuo
    Zhang, Shifeng
    Wang, Hanzi
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2024, 15 (01)
  • [43] Orientation-Aware Pedestrian Attribute Recognition Based on Graph Convolution Network
    Lu, Wei-Qing
    Hu, Hai-Miao
    Yu, Jinzuo
    Zhou, Yibo
    Wang, Hanzi
    Li, Bo
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 28 - 40
  • [44] Pedestrian Attribute Recognition Model based on Adaptive Weight and Depthwise Separable Convolutions
    Lin, Xiao
    PROCEEDINGS OF 2020 IEEE 5TH INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC 2020), 2020, : 830 - 833
  • [45] Pedestrian Attribute Recognition with Part-based CNN and Combined Feature Representations
    Chen, Yiqiang
    Duffner, Stefan
    Stoian, Andrei
    Dufour, Jean-Yves
    Baskurt, Atilla
    PROCEEDINGS OF THE 13TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISIGRAPP 2018), VOL 5: VISAPP, 2018, : 114 - 122
  • [46] Overview of deep learning based pedestrian attribute recognition and re-identification
    Wu, Duidi
    Huang, Haiqing
    Zhao, Qianyou
    Zhang, Shuo
    Qi, Jin
    Hu, Jie
    HELIYON, 2022, 8 (12)
  • [47] Pedestrian Attribute Recognition Algorithm Based on Multi-Scale Attention Network
    Li Na
    Wu Yangyang
    Liu Ying
    Xing Jin
    LASER & OPTOELECTRONICS PROGRESS, 2021, 58 (04)
  • [48] Multimodal Transformer for Nursing Activity Recognition
    Ijaz, Momal
    Diaz, Renato
    Chen, Chen
    arXiv, 2022
  • [49] Multimodal Transformer for Nursing Activity Recognition
    Ijaz, Momal
    Diaz, Renato
    Chen, Chen
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2064 - 2073
  • [50] Selective and Orthogonal Feature Activation for Pedestrian Attribute Recognition
    Wu, Junyi
    Huang, Yan
    Gao, Min
    Niu, Yuzhen
    Yang, Mingjing
    Gao, Zhipeng
    Zhao, Jianqiang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6039 - 6047