Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model

Cited by: 18
Authors
Yang, Shiyuan [1]
Chen, Xiaodong [2]
Liao, Jing [1]
Affiliations
[1] City University of Hong Kong, Hong Kong, People's Republic of China
[2] Tianjin University, Tianjin, People's Republic of China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Keywords
Image Inpainting; Diffusion Model; Multimodal
DOI
10.1145/3581783.3612200
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Recently, text-to-image denoising diffusion probabilistic models (DDPMs) have demonstrated impressive image generation capabilities and have also been successfully applied to image inpainting. However, in practice, users often require more control over the inpainting process beyond textual guidance, especially when they want to composite objects with customized appearance, color, shape, and layout. Unfortunately, existing diffusion-based inpainting methods are limited to single-modal guidance and require task-specific training, hindering their cross-modal scalability. To address these limitations, we propose Uni-paint, a unified framework for multi-modal inpainting that offers various modes of guidance, including unconditional, text-driven, stroke-driven, exemplar-driven inpainting, as well as a combination of these modes. Furthermore, our Uni-paint is based on pretrained Stable Diffusion and does not require task-specific training on specific datasets, enabling few-shot generalizability to customized images. We have conducted extensive qualitative and quantitative evaluations that show our approach achieves comparable results to existing single-modal methods while offering multimodal inpainting capabilities not available in other methods. Code is available at https://github.com/ysy31415/unipaint.
Pages: 3190-3199
Number of pages: 10