Enhanced Context Learning with Transformer for Human Parsing

Cited by: 1
Authors
Song, Jingya [1, 2, 3]
Shi, Qingxuan [1, 2, 3]
Li, Yihang [1, 2, 3]
Yang, Fang [1, 2, 3]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China
[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 15
Keywords
human parsing; semantic segmentation; deep learning; SEGMENTATION
DOI
10.3390/app12157821
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Human parsing is a fine-grained human semantic segmentation task in computer vision. Because of occlusion, diverse poses, and the similar appearance of different body parts and clothing, human parsing requires particular attention to context information. Based on this observation, we enhance the learning of global and local information to obtain more accurate human parsing results. In this paper, we introduce a Global Transformer Module (GTM) that uses a self-attention mechanism to capture long-range dependencies and extract context information effectively. Moreover, we design a Detailed Feature Enhancement (DFE) architecture to exploit spatial semantics for small targets. The low-level visual features from intermediate CNN layers are enhanced using channel and spatial attention. In addition, we adopt an edge detection module to refine the prediction. We conducted extensive experiments on three datasets (LIP, ATR, and Fashion Clothing) to show the effectiveness of our method, which achieves 54.55% mIoU on the LIP dataset, an average F-1 score of 80.26% on the ATR dataset, and an average F-1 score of 55.19% on the Fashion Clothing dataset.
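For readers unfamiliar with the mIoU metric reported in the abstract, the following is a minimal sketch of how mean intersection-over-union can be computed from predicted and ground-truth label maps. It is an illustration only, not the authors' evaluation code: the function name `mean_iou`, the flattened-list input, and the toy labels are assumptions, and benchmark protocols typically accumulate counts over an entire dataset rather than a single image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where prediction and label agree
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labelled as each class

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and label
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels and three made-up classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is about 0.5. Skipping classes that appear in neither map is one common convention; other evaluation scripts may handle absent classes differently.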
Pages: 16