Text-free diffusion inpainting using reference images for enhanced visual fidelity

Cited by: 0
Authors
Kim, Beomjo [1]
Sohn, Kyung-Ah [1]
Affiliations
[1] Ajou Univ, Dept Artificial Intelligence, 206 World Cup Ro, Suwon 16499, Gyeonggi Do, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Diffusion models; Image generation; Image inpainting; Subject-driven generation; Image manipulation;
DOI
10.1016/j.patrec.2024.10.009
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel approach to subject-driven image generation that addresses the limitations of traditional text-to-image diffusion models. Our method generates images using reference images without relying on language-based prompts. We introduce a visual detail preserving module that captures intricate details and textures, addressing overfitting issues associated with limited training samples. The model's performance is further enhanced through a modified classifier-free guidance technique and feature concatenation, enabling the natural positioning and harmonization of subjects within diverse scenes. Quantitative assessments using CLIP, DINO and Quality scores (QS), along with a user study, demonstrate the superior quality of our generated images. Our work highlights the potential of pre-trained models and visual patch embeddings in subject-driven editing, balancing diversity and fidelity in image generation tasks. Our implementation is available at https://github.com/8eomio/Subject-Inpainting.
Pages: 221-228
Page count: 8