Controllable image generation based on causal representation learning

Cited by: 2

Authors:

Huang, Shanshan [1]

Wang, Yuanhao [1]

Gong, Zhili [1]

Liao, Jun [1]

Wang, Shu [2]

Liu, Li [1]

Affiliations:

[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 401331, Peoples R China

[2] Southwest Univ, Sch Mat & Energy, Chongqing 400715, Peoples R China

Source:

FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING | 2024, Vol. 25, No. 01

Funding:

National Natural Science Foundation of China

Keywords:

Image generation; Controllable image editing; Causal structure learning; Causal representation learning; MODEL;

DOI:

10.1631/FITEE.2300303

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Artificial intelligence generated content (AIGC) has emerged as an indispensable tool for producing large-scale content in various forms, such as images, thanks to the significant role that AI plays in imitation and production. However, interpretability and controllability remain challenges. Existing AI methods often struggle to produce images that are both flexible and controllable while respecting the causal relationships within the images. To address this issue, we have developed a novel method for causal controllable image generation (CCIG) that combines causal representation learning with bi-directional generative adversarial networks (GANs). This approach enables humans to control image attributes while considering the rationality and interpretability of the generated images, and also allows for the generation of counterfactual images. The key to CCIG lies in using a causal structure learning module to learn the causal relationships between image attributes, optimized jointly with the encoder, generator, and joint discriminator in the image generation module. By doing so, we can learn causal representations in the latent space of images and use causal intervention operations to control image generation. We conduct extensive experiments on a real-world dataset, CelebA. The experimental results illustrate the effectiveness of CCIG.
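
The abstract mentions causal intervention on latent attributes without giving detail. The following is a minimal sketch of what a do-intervention looks like, assuming a linear structural causal model over three attributes. The attribute names, the edge weights, and the function `generate` are hypothetical and chosen for illustration only; they are not taken from the paper, and the sketch omits the encoder, generator, and discriminator entirely.

```python
# Hypothetical attributes, listed in topological (cause-before-effect) order.
ATTRS = ["gender", "age", "beard"]

# WEIGHTS[child][parent] = assumed linear causal effect of parent on child.
# These values are invented for illustration, not learned from data.
WEIGHTS = {
    "gender": {},
    "age": {},
    "beard": {"gender": 0.8, "age": 0.3},
}


def generate(noise, interventions=None):
    """Compute attribute values from exogenous noise under a linear SCM.

    An intervention do(attr = value) replaces the structural equation of
    that attribute, so its parents no longer influence it, while its
    descendants still respond to the new value.
    """
    interventions = interventions or {}
    z = {}
    for attr in ATTRS:
        if attr in interventions:
            z[attr] = interventions[attr]
        else:
            z[attr] = noise[attr] + sum(
                w * z[parent] for parent, w in WEIGHTS[attr].items()
            )
    return z


noise = {"gender": 1.0, "age": 0.5, "beard": 0.1}

factual = generate(noise)                          # no intervention
counterfactual = generate(noise, {"gender": 0.0})  # intervene on a cause
on_effect = generate(noise, {"beard": 0.0})        # intervene on an effect
```

Under these assumed weights, intervening on the cause ("gender") changes the downstream "beard" value, whereas intervening on the effect ("beard") leaves "gender" and "age" untouched. That asymmetry is what distinguishes a causal intervention from editing correlated attributes independently.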

Pages: 135-148

Page count: 14
