Dynamic Interaction Dilation for Interactive Human Parsing

Cited by: 1
Authors
Gao, Yutong [1]
Lang, Congyan [1]
Liu, Fayao [2]
Cao, Yuanzhouhan [3]
Sun, Lijuan [4]
Wei, Yunchao [5]
Affiliations
[1] Beijing Jiaotong Univ, Minist Educ, Key Lab Big Data & Artificial Intelligence Transpo, Beijing 100044, Peoples R China
[2] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
[3] Beijing Jiaotong Univ, Sch Comp Sci & Informat Technol, Beijing 100044, Peoples R China
[4] Beijing Univ Posts & Telecommun, Sch Econ & Management, Minist Educ, Key Lab Trustworthy Distributed Comp & Serv, Beijing 100876, Peoples R China
[5] Beijing Jiaotong Univ, Inst Informat Sci, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Image edge detection; Annotations; Location awareness; Feature extraction; Transforms; Task analysis; Human parsing; interactive image segmentation; semantic image segmentation; IMAGE SEGMENTATION;
DOI
10.1109/TMM.2023.3262973
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Interactive segmentation aims to generate high-quality pixel-level predictions from a few user-provided clicks, and is gaining attention for its convenience in annotating segmentation data. Users can iteratively refine the prediction by adding clicks until the result is satisfactory. Existing interactive methods usually transform the clicks into a set of localization maps, by Euclidean distance computation or RGB texture extraction, to guide the segmentation, which makes the click transformation a core module in interactive segmentation networks. However, when applied to human images, where large pose variations, occlusions, and poor illumination are prevalent, prior transformation methods tend to produce uncorrectable overlaps across localization maps, which then fail to match the individual human parts well. Furthermore, inappropriately transformed information is hard to refine because the transformation is static, which is at odds with the dynamically refined interaction process. Hence, we design a dynamic transformation scheme for interactive human parsing (IHP) named Dynamic Interaction Dilation Net (DID-Net), which serves as an initial attempt to overcome the limitations of static transformation while capturing long-range dependencies of clicks within each human part. Specifically, we construct a Dynamic Dilation Module (DD-Module) that dilates clicks radially in several directions, assisted by human body edge detection, to refine the dilation quality in each interaction iteration. Furthermore, we propose an Adaptive Interaction Excitation Block (AIE-Block) to exploit potential semantic clues buried in the dilated clicks. Our DID-Net achieves state-of-the-art performance on three public human parsing benchmarks.
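The contrast the abstract draws between a static click transformation and an edge-assisted radial dilation can be illustrated with a toy sketch. This is not the authors' DD-Module, whose details the abstract does not give. The function names `distance_map` and `radial_dilation`, the parameters `cap`, `n_dirs`, and `max_len`, and the rule of stopping each ray at the first edge pixel are all assumptions made for illustration.

```python
import math
import numpy as np

def distance_map(clicks, shape, cap=255.0):
    """Static encoding: each pixel holds its Euclidean distance to the
    nearest click, truncated at `cap` (hypothetical parameter)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    dmap = np.full(shape, cap, dtype=np.float64)
    for cy, cx in clicks:
        dmap = np.minimum(dmap, np.hypot(ys - cy, xs - cx))
    return dmap

def radial_dilation(click, edges, n_dirs=8, max_len=50):
    """Toy edge-aware encoding (an assumption, not the paper's module):
    grow `n_dirs` rays from the click and stop each ray at the first
    edge pixel or at the image border."""
    h, w = edges.shape
    cy, cx = click
    out = np.zeros((h, w), dtype=bool)
    out[cy, cx] = True
    for k in range(n_dirs):
        angle = 2.0 * math.pi * k / n_dirs
        dy, dx = math.sin(angle), math.cos(angle)
        for step in range(1, max_len + 1):
            y = int(round(cy + step * dy))
            x = int(round(cx + step * dx))
            # Stop the ray when it leaves the image or hits an edge pixel.
            if not (0 <= y < h and 0 <= x < w) or edges[y, x]:
                break
            out[y, x] = True
    return out

# Usage: a vertical edge at column 6 blocks the rightward ray from (4, 4).
edges = np.zeros((9, 9), dtype=bool)
edges[:, 6] = True
mask = radial_dilation((4, 4), edges)
dmap = distance_map([(0, 0), (2, 3)], (3, 4))
```

In this sketch the distance map ignores image content entirely, so maps for clicks on adjacent parts overlap freely, whereas the ray-based mask is confined by the edge map and could be recomputed whenever a new click or a refined edge estimate arrives.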
Pages: 178-189
Page count: 12