Text-free diffusion inpainting using reference images for enhanced visual fidelity

Authors
Kim, Beomjo [1]
Sohn, Kyung-Ah [1]
Affiliations
[1] Ajou Univ, Dept Artificial Intelligence, 206 World Cup Ro, Suwon 16499, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore
Keywords
Diffusion models; Image generation; Image inpainting; Subject-driven generation; Image manipulation;
DOI
10.1016/j.patrec.2024.10.009
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel approach to subject-driven image generation that addresses the limitations of traditional text-to-image diffusion models. Our method generates images using reference images without relying on language-based prompts. We introduce a visual detail preserving module that captures intricate details and textures, addressing overfitting issues associated with limited training samples. The model's performance is further enhanced through a modified classifier-free guidance technique and feature concatenation, enabling the natural positioning and harmonization of subjects within diverse scenes. Quantitative assessments using CLIP, DINO and Quality scores (QS), along with a user study, demonstrate the superior quality of our generated images. Our work highlights the potential of pre-trained models and visual patch embeddings in subject-driven editing, balancing diversity and fidelity in image generation tasks. Our implementation is available at https://github.com/8eomio/Subject-Inpainting.
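The abstract mentions a modified classifier-free guidance technique but does not describe the modification. For orientation only, the sketch below shows the standard, unmodified classifier-free guidance combination, in which a conditional and an unconditional noise prediction are blended by a guidance scale. The function name, argument names, and the use of plain Python lists are assumptions made for illustration; this is not the authors' method or code.

```python
from typing import List, Sequence


def classifier_free_guidance(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Standard (unmodified) classifier-free guidance.

    Computes eps = eps_uncond + s * (eps_cond - eps_uncond) element-wise.
    The names and the list-based representation are hypothetical; a real
    diffusion model would apply the same formula to tensors.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("noise predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    # Toy noise predictions; in the paper's setting the condition would
    # presumably come from a reference image rather than a text prompt.
    uncond = [0.0, 1.0]
    cond = [1.0, 3.0]
    print(classifier_free_guidance(uncond, cond, guidance_scale=2.0))
```

With a guidance scale of 1 the result equals the conditional prediction, with 0 it equals the unconditional one, and larger values push the prediction further toward the condition.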
Pages: 221-228
Page count: 8