Toward Multi-Modal Conditioned Fashion Image Translation

Cited by: 13

Authors:

Gu, Xiaoling [1]

Yu, Jun [1]

Wong, Yongkang [2]

Kankanhalli, Mohan S. [2]

Affiliations:

[1] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Key Lab Complex Syst Modeling & Simulat, Hangzhou 310018, Peoples R China

[2] Natl Univ Singapore, Sch Comp, Singapore 119613, Singapore

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2021 / Vol. 23

Funding:

National Science Foundation (USA); National Research Foundation, Singapore;

Keywords:

Generative adversarial network; fashion image synthesis; image-to-image translation; RETRIEVAL;

DOI:

10.1109/TMM.2020.3009500

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The capability to synthesize photo-realistic fashion product images conditioned on multiple attributes or modalities would enable many new applications. In this work, we propose an end-to-end network architecture, built upon a new generative adversarial network, for automatically synthesizing photo-realistic images of fashion products under multiple conditions. Given an input pose image that consists of a 2D skeleton pose and a sentence description of products, our model synthesizes a fashion image that preserves the same pose and shows the fashion products described in the text. Specifically, the generator G tries to generate realistic-looking fashion images based on a <pose, text> pair condition to fool the discriminator. An attention network is added to enhance the generator; it predicts a probability map indicating which parts of the image need to be attended to for translation. In contrast, the discriminator D distinguishes real images from the translated ones based on the input pose image and text description. The discriminator is divided into two multi-scale sub-discriminators to improve the image discrimination task. Quantitative and qualitative analyses demonstrate that our method is capable of synthesizing realistic images that retain the poses of the given images while matching the semantics of the provided sentence descriptions.
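The abstract does not state how the predicted probability map is applied. A common pattern in attention-guided image translation is a per-pixel convex blend between a source image and the generator's output, and the sketch below illustrates only that generic pattern. The function name `blend_with_attention`, the single-channel list-of-lists image type, and the choice of what is blended are all assumptions for illustration, not the paper's confirmed formulation.

```python
from typing import List

# Assumption: a single-channel image stored as rows of pixel values in [0, 1].
Image = List[List[float]]


def blend_with_attention(source: Image, translated: Image, attention: Image) -> Image:
    """Blend two images per pixel using an attention (probability) map.

    attention = 1 takes the translated pixel, attention = 0 keeps the source
    pixel, and values in between mix the two linearly.
    """
    if not (len(source) == len(translated) == len(attention)):
        raise ValueError("all inputs must have the same number of rows")

    blended: Image = []
    for s_row, t_row, a_row in zip(source, translated, attention):
        if not (len(s_row) == len(t_row) == len(a_row)):
            raise ValueError("all inputs must have the same row length")
        # Convex combination weighted by the attention probability.
        blended.append([a * t + (1.0 - a) * s for s, t, a in zip(s_row, t_row, a_row)])
    return blended


if __name__ == "__main__":
    source = [[0.0, 0.0], [1.0, 1.0]]
    translated = [[1.0, 1.0], [0.0, 0.0]]
    attention = [[1.0, 0.0], [0.5, 0.25]]
    print(blend_with_attention(source, translated, attention))
```

In a blend of this kind, pixels with low attention pass through unchanged, so the generator only has to produce content where the map is high.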

Citation

Pages: 2361-2371

Number of pages: 11

