Prompt-Based Learning for Image Variation Using Single Image Multi-Scale Diffusion Models

Cited by: 0
Authors
Park, Jiwon [1]
Jeong, Dasol [2]
Lee, Hyebean [2]
Han, Seunghee [2]
Paik, Joonki [1, 2]
Affiliations
[1] Chung Ang Univ, Dept Artificial Intelligence, Seoul 06974, South Korea
[2] Chung Ang Univ, Dept Image, Seoul 06974, South Korea
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation, Singapore;
Keywords
Training; Computational modeling; Periodic structures; Diffusion models; Data models; Image synthesis; Adaptation models; Noise reduction; Feature extraction; Context modeling; Single image generation; prompt-based learning; text guided image editing;
DOI
10.1109/ACCESS.2024.3487215
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose a novel technique for a multi-scale framework with text-based learning using a single image to perform variations and text-based editing of the input image. Our approach captures the detailed internal information of a single image, enabling numerous variations while preserving the original features. In addition, text-conditioned learning provides a method to combine text and images to effectively perform text-based editing based on a single image. We propose a technique that integrates the diffusion U-Net structure within a multi-scale framework to accurately capture the quality and internal structure of an image from a single image and perform diverse variations while maintaining the features of the original image. Additionally, we utilized a pre-trained Bootstrapped Language-Image Pretraining (BLIP) model to generate various prompts for effective text-based editing, and we fed the prompts that most closely resembled the input image into the training process using Contrastive Language-Image Pretraining (CLIP)'s prior knowledge. To improve accuracy during the image editing stage, we designed a contrastive loss function to enhance the relevance between the prompt and the image. As a result, we improved the performance of learning between text and images, and through various experiments, we demonstrated its effectiveness on text-based image editing tasks. Our experiments show that the proposed method significantly improves the performance of single-image-based generative models and presents new possibilities in the field of text-based image editing.
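The prompt-selection and contrastive-loss steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_prompts` and `contrastive_loss`, the toy three-dimensional embeddings, the InfoNCE-style form of the loss, and the temperature value are all assumptions made for illustration. In practice the embeddings would come from a CLIP image and text encoder and the candidate prompts from BLIP.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def select_prompts(image_emb, prompt_embs, k=1):
    """Hypothetical helper: return the k prompts closest to the image embedding."""
    ranked = sorted(
        prompt_embs,
        key=lambda p: cosine(image_emb, prompt_embs[p]),
        reverse=True,
    )
    return ranked[:k]

def contrastive_loss(image_emb, positive_emb, negative_embs, temperature=0.07):
    """Assumed InfoNCE-style loss: small when the image is closer to the
    positive prompt than to every negative prompt."""
    logits = [cosine(image_emb, positive_emb) / temperature]
    logits += [cosine(image_emb, n) / temperature for n in negative_embs]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]

# Toy embeddings standing in for CLIP outputs (assumed values).
image = [1.0, 0.0, 0.0]
prompts = {
    "a photo of a mountain lake": [0.9, 0.1, 0.0],
    "a city street at night": [0.0, 1.0, 0.0],
    "a bowl of fruit": [0.1, 0.0, 1.0],
}

best = select_prompts(image, prompts, k=1)
others = [v for p, v in prompts.items() if p != best[0]]
loss_good = contrastive_loss(image, prompts[best[0]], others)
loss_bad = contrastive_loss(
    image,
    prompts["a city street at night"],
    [prompts["a photo of a mountain lake"], prompts["a bowl of fruit"]],
)
print(best, loss_good < loss_bad)
```

In this sketch the best-matching prompt yields a much lower loss than a mismatched one, which is the property a prompt-image contrastive objective is meant to encourage.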
Pages: 158810-158823
Page count: 14
Related papers
50 in total
  • [11] Prompt-Based Tuning of Transformer Models for Multi-Center Medical Image Segmentation of Head and Neck Cancer
    Saeed, Numan
    Ridzuan, Muhammad
    Majzoub, Roba Al
    Yaqub, Mohammad
    BIOENGINEERING-BASEL, 2023, 10 (07):
  • [12] Single Fog Image Restoration via Multi-scale Image Fusion
    Gao, Yin
    Su, Yijing
    Li, Qiming
    Li, Jun
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1873 - 1878
  • [13] MULTI-SCALE BLOCKS BASED IMAGE EMOTION CLASSIFICATION USING MULTIPLE INSTANCE LEARNING
    Rao, Tianrong
    Xu, Min
    Liu, Huiying
    Wang, Jinqiao
    Burnett, Ian
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 634 - 638
  • [14] Automatic Image Enhancement Based On Multi-scale Image Decomposition
    Feng, Lu
    Wu, Zhuangzhi
    Pei, Luo
    Long, Xiong
    FIFTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2013), 2014, 9069
  • [15] Deep learning-based compressed image artifacts reduction based on multi-scale image fusion
    Yeh, Chia-Hung
    Lin, Chu-Han
    Lin, Min-Hui
    Kang, Li-Wei
    Huang, Chih-Hsiang
    Chen, Mei-Juan
    INFORMATION FUSION, 2021, 67 : 195 - 207
  • [16] Multi-Scale Single Image Dehazing Using Laplacian and Gaussian Pyramids
    Li, Zhengguo
    Shu, Haiyan
    Zheng, Chaobing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 (30) : 9270 - 9279
  • [17] Image Compressed Sensing Using Multi-Scale Characteristic Residual Learning
    Yang, Shumian
    Xiang, Xinxin
    Tong, Fenghua
    Zhao, Dawei
    Li, Xin
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1595 - 1600
  • [18] Image manipulation detection and localization using multi-scale contrastive learning
    Bai, Ruyi
    APPLIED SOFT COMPUTING, 2024, 163
  • [19] Underwater Image Enhancement Based on Multi-Scale Attention and Contrast Learning
    Wang Yue
    Fan Huijie
    Liu Shiben
    Tang Yandong
    LASER & OPTOELECTRONICS PROGRESS, 2024, 61 (04)
  • [20] Automated Regularization Parameter Selection in Multi-Scale Total Variation Models for Image Restoration
    Dong, Yiqiu
    Hintermueller, Michael
    Rincon-Camacho, M. Monserrat
    JOURNAL OF MATHEMATICAL IMAGING AND VISION, 2011, 40 (01) : 82 - 104