Query-Selected Global Attention for Text guided Image Style Transfer using Diffusion Model

Cited by: 0
Authors
Hwang, Jungmin [1]
Lee, Won-Sook [1]
Affiliations
[1] Univ Ottawa, Fac Engn, Sch EECS, Ottawa, ON, Canada
Keywords
Diffusion; Style Transfer; Query Selection; Global Attention
DOI
10.1109/CAI59869.2024.00207
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Diffusion models have attracted considerable interest in image generation, and text-guided methods for manipulating source images have made steady progress. However, style transfer with diffusion models still faces a trade-off between style transfer and content preservation. One representative solution is self-supervised contrastive learning, which extracts features from the same location in the source and generated images at every pixel. In some cases, however, certain areas carry more information from the source image than others and must be preserved. We therefore propose anchoring the areas to be preserved and intentionally selecting features at the anchor points through a query-selected global attention method. This lets our method generate an image that preserves the content of the source while transferring the style, without additional fine-tuning or an auxiliary network. Compared with other diffusion methods, our diffusion model follows a simple architecture to enhance image quality and speed up inference. Our experimental results also demonstrate superior performance.
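The abstract does not say how anchor points are chosen, so the following is only a minimal sketch of one plausible reading of "query-selected global attention", not the authors' implementation. It assumes that anchors are the query locations whose global attention row has the lowest entropy, and that the same selected rows are reused to aggregate features from both the source and the generated image. The function names, the entropy criterion, and the toy feature values are all hypothetical.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_matrix(features):
    # Global attention: each location attends to every location
    # (row-wise softmax over pairwise dot products).
    return [softmax([sum(a * b for a, b in zip(q, k)) for k in features])
            for q in features]

def row_entropy(probs):
    # Shannon entropy of one attention row; low entropy = focused attention.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_anchor_queries(features, num_anchors):
    # ASSUMPTION: anchors are the rows with the lowest attention entropy.
    attn = attention_matrix(features)
    entropies = [row_entropy(r) for r in attn]
    order = sorted(range(len(features)), key=lambda i: entropies[i])
    anchors = sorted(order[:num_anchors])
    return anchors, [attn[i] for i in anchors]

def route_features(selected_attn, features):
    # Aggregate features with the selected attention rows (weighted sums).
    dim = len(features[0])
    return [[sum(w * f[d] for w, f in zip(row, features)) for d in range(dim)]
            for row in selected_attn]

# Toy data: 4 spatial locations with 2-dimensional features (hypothetical).
source_feats = [[4.0, 0.0], [0.0, 4.0], [0.1, 0.1], [0.1, 0.0]]
generated_feats = [[3.5, 0.2], [0.1, 3.8], [0.2, 0.1], [0.0, 0.1]]

# Attention and anchors come from the source image only.
anchors, selected_attn = select_anchor_queries(source_feats, num_anchors=2)

# The same rows route both images, so features are compared at the same anchors.
routed_src = route_features(selected_attn, source_feats)
routed_gen = route_features(selected_attn, generated_feats)
print(anchors)
```

In a contrastive setup, each pair `routed_src[i]`, `routed_gen[i]` would serve as a positive pair and the remaining anchors as negatives. That pairing is likewise an assumption here, since the abstract does not describe the loss.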
Pages: 1162-1166
Page count: 5