Multimodal supervised image translation

Cited by: 3
Authors
Ruan, Congcong [1]
Chen, Dihu [1]
Hu, Haifeng [1]
Affiliation
[1] Sun Yat-sen University, Guangzhou, Guangdong, People's Republic of China
Funding
National Natural Science Foundation of China
DOI
10.1049/el.2018.6167
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
Multimodal image-to-image translation is a class of vision and graphics problems in which the goal is to learn a one-to-many mapping between a source domain and a target domain. Given an image in the source domain, the model aims to produce as many diverse results as possible, which makes this an important and challenging problem in image translation. Recent works use Gaussian vectors to produce diverse results, but the differences among those results are small, which the authors attribute to the probabilistic nature of the Gaussian distribution. In this work, the authors propose linearly distributed latent codes in place of conventional Gaussian vectors to control the style of the generated images. Taking advantage of the linear distribution, their model produces much more diverse results and outperforms the state-of-the-art baselines in terms of diversity. Qualitative and quantitative comparisons against the baselines demonstrate the effectiveness and superiority of their method.
Pages: 190-191
Number of pages: 2