Enhanced Context Learning with Transformer for Human Parsing

Cited by: 1
Authors
Song, Jingya [1 ,2 ,3 ]
Shi, Qingxuan [1 ,2 ,3 ]
Li, Yihang [1 ,2 ,3 ]
Yang, Fang [1 ,2 ,3 ]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China
[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 15
Keywords
human parsing; semantic segmentation; deep learning; SEGMENTATION;
DOI
10.3390/app12157821
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Human parsing is a fine-grained human semantic segmentation task in computer vision. Because of occlusion, diverse poses, and the similar appearance of different body parts and clothing, human parsing requires closer attention to context information. Based on this observation, we enhance the learning of global and local information to obtain more accurate human parsing results. In this paper, we introduce a Global Transformer Module (GTM) that uses a self-attention mechanism to capture long-range dependencies and extract context information effectively. Moreover, we design a Detailed Feature Enhancement (DFE) architecture to exploit spatial semantics for small targets: the low-level visual features from intermediate CNN layers are enhanced with channel and spatial attention. In addition, we adopt an edge detection module to refine the prediction. We conducted extensive experiments on three datasets (LIP, ATR, and Fashion Clothing) to show the effectiveness of our method, which achieves 54.55% mIoU on the LIP dataset, an average F-1 score of 80.26% on the ATR dataset, and an average F-1 score of 55.19% on the Fashion Clothing dataset.
Pages: 16