Enhanced Context Learning with Transformer for Human Parsing

Cited by: 1
Authors
Song, Jingya [1, 2, 3]
Shi, Qingxuan [1, 2, 3]
Li, Yihang [1, 2, 3]
Yang, Fang [1, 2, 3]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China
[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071002, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 15
Keywords
human parsing; semantic segmentation; deep learning; SEGMENTATION;
DOI
10.3390/app12157821
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Human parsing is a fine-grained human semantic segmentation task in computer vision. Because of occlusion, diverse poses, and the similar appearance of different body parts and clothing, human parsing depends heavily on learning context information. Based on this observation, we enhance the learning of global and local information to obtain more accurate human parsing results. In this paper, we introduce a Global Transformer Module (GTM) that uses a self-attention mechanism to capture long-range dependencies and extract context information effectively. Moreover, we design a Detailed Feature Enhancement (DFE) architecture to exploit spatial semantics for small targets. The low-level visual features from CNN intermediate layers are enhanced with channel and spatial attention. In addition, we adopt an edge detection module to refine the prediction. We conducted extensive experiments on three datasets (LIP, ATR, and Fashion Clothing) to show the effectiveness of our method, which achieves 54.55% mIoU on the LIP dataset, an average F-1 score of 80.26% on the ATR dataset, and an average F-1 score of 55.19% on the Fashion Clothing dataset.
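The abstract reports results as mIoU (mean intersection over union). As a rough illustration of how that metric is commonly computed for segmentation labels, the sketch below averages per-class IoU over flat label sequences. It is not the authors' evaluation code: the function name `mean_iou`, the choice to skip classes absent from both prediction and ground truth, and the toy labels are all assumptions made for illustration, and the exact protocol used on LIP is not stated in this record.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    pred and target are equal-length flat sequences of integer class labels
    (hypothetical input format chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is 0.5.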
Pages: 16
Related papers
50 in total
  • [21] Enhancing human parsing with region-level learning
    Zhou, Yanghong
    Mok, P. Y.
    IET COMPUTER VISION, 2024, 18 (01) : 60 - 71
  • [22] Technology-Enhanced Learning of Human Trauma Biomechanics in an Interprofessional Student Context
    Moller, Hans
    Creutzfeldt, Johan
    Valeskog, Karin
    Rystedt, Hans
    Edelbring, Samuel
    Fahlstedt, Madelen
    Fellander-Tsai, Li
    Abrandt Dahlgren, Madeleine
    TEACHING AND LEARNING IN MEDICINE, 2022, 34 (02) : 135 - 144
  • [23] PARSING STRATEGIES AND DISCOURSE CONTEXT
    HOLMES, VM
    JOURNAL OF PSYCHOLINGUISTIC RESEARCH, 1984, 13 (03) : 237 - 257
  • [24] Online Structured Learning for Semantic Parsing with Synchronous and λ-Synchronous Context Free Grammars
    Nguyen, Le-Minh
    Shimazu, Akira
    Phan, Xuan Hieu
    Nguyen, Phuong Thai
    20TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL 2, PROCEEDINGS, 2008, : 135 - +
  • [25] Fashion Parsing With Video Context
    Liu, Si
    Liang, Xiaodan
    Liu, Luoqi
    Lu, Ke
    Lin, Liang
    Cao, Xiaochun
    Yan, Shuicheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (08) : 1347 - 1358
  • [26] CONTEXT-SENSITIVE PARSING
    WOODS, WA
    COMMUNICATIONS OF THE ACM, 1970, 13 (07) : 437 - &
  • [27] OBSERVATIONS ON CONTEXT FREE PARSING
    SHEIL, BA
    STATISTICAL METHODS IN LINGUISTICS, 1976, : 71 - 109
  • [28] Two-Layer Context-Enhanced Representation for Better Chinese Discourse Parsing
    Zhu, Qiang
    Wang, Kedong
    Kong, Fang
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I, 2022, 13551 : 43 - 54
  • [29] Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing
    Guo, Daya
    Tang, Duyu
    Duan, Nan
    Zhou, Ming
    Yin, Jian
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 855 - 866
  • [30] Fashion Parsing with Video Context
    Liu, Si
    Liang, Xiaodan
    Liu, Luoqi
    Lu, Ke
    Lin, Liang
    Yan, Shuicheng
    PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 467 - 476