InteractGAN: Learning to Generate Human-Object Interaction

Cited by: 8
Authors
Gao, Chen [1]
Liu, Si [2]
Zhu, Defa [1]
Liu, Quan [2]
Cao, Jie [3]
He, Haoqian [2]
He, Ran [3]
Yan, Shuicheng [4]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Beihang Univ, Beijing, Peoples R China
[3] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[4] Yitu Technol, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
HOI-IG; InteractGAN; Relation-based Transformation; Relation-based Image Merging; Image Synthesis;
DOI
10.1145/3394171.3413854
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To the best of our knowledge, in contrast to the widely studied Human-Object Interaction Detection (HOI-DET), no effort has been devoted to its inverse problem: generating an HOI scene image from a given relationship triplet <human, predicate, object>. We term this new task "Human-Object Interaction Image Generation" (HOI-IG). HOI-IG is a research-worthy task with promising applications such as online shopping, film production, and interactive entertainment. In this work, we introduce InteractGAN to solve this challenging task. Our method is composed of two stages: (1) manipulating the posture of a given human image conditioned on a predicate, and (2) merging the transformed human image and the object image into one realistic scene image while satisfying their expected relative position and ratio. In addition, to address the large spatial misalignment that arises when fusing the content of two images into a reasonable spatial layout, we propose a Relation-based Spatial Transformer Network (RSTN) that adaptively processes the images conditioned on their interaction. Extensive experiments on two challenging datasets demonstrate the effectiveness and superiority of our approach. We encourage the image generation community to pay more attention to the new Human-Object Interaction Image Generation problem. To facilitate future research, our project will be released at: http://colalab.org/projects/InteractGAN.
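The second stage in the abstract places the object relative to the human at an expected position and ratio. As a rough illustration only, and not the authors' RSTN, the sketch below shows the kind of scale-plus-translation affine matrix that a spatial transformer typically outputs. The target layout is hand-specified here, whereas the abstract describes it as being predicted conditioned on the interaction. The names `Box`, `affine_from_layout`, and `apply_affine` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned box: top-left corner (x, y), width w, height h (hypothetical helper)."""
    x: float
    y: float
    w: float
    h: float


def affine_from_layout(src: Box, dst: Box):
    """Return a 2x3 affine matrix (scale + translation) that maps src onto dst.

    Assumption: the desired layout `dst` is given by hand. In the paper's
    setting a network would predict such parameters from the interaction.
    """
    sx = dst.w / src.w          # horizontal scale (relative ratio)
    sy = dst.h / src.h          # vertical scale
    tx = dst.x - sx * src.x     # horizontal shift (relative position)
    ty = dst.y - sy * src.y     # vertical shift
    return [[sx, 0.0, tx], [0.0, sy, ty]]


def apply_affine(theta, point):
    """Apply the 2x3 affine matrix theta to a 2D point (x, y)."""
    x, y = point
    return (
        theta[0][0] * x + theta[0][1] * y + theta[0][2],
        theta[1][0] * x + theta[1][1] * y + theta[1][2],
    )


# Illustrative numbers: a 100x50 object crop placed at half size,
# with its top-left corner at (40, 60) in the scene.
object_box = Box(0.0, 0.0, 100.0, 50.0)
target_box = Box(40.0, 60.0, 50.0, 25.0)
theta = affine_from_layout(object_box, target_box)
corner = apply_affine(theta, (100.0, 50.0))  # bottom-right corner of the crop
```

In a learned setting, the six entries of `theta` would be regressed by a network and applied with differentiable image sampling, which keeps the merging step trainable end to end.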
Pages: 165-173
Page count: 9