Query-Selected Global Attention for Text guided Image Style Transfer using Diffusion Model

Cited by: 0

Authors:

Hwang, Jungmin [1]

Lee, Won-Sook [1]

Affiliations:

[1] Univ Ottawa, Fac Engn, Sch EECS, Ottawa, ON, Canada

Source:

2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024 | 2024

Keywords:

Diffusion; Style Transfer; Query Selection; Global Attention

DOI:

10.1109/CAI59869.2024.00207

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Diffusion models have gained tremendous interest in image generation. Additionally, text-guided methods for manipulating source images have shown successful progress. However, research on style transfer using diffusion models is still ongoing to address the trade-off between style transfer and content preservation. One representative solution to the issue is contrastive learning in a self-supervised manner, which is useful for extracting specific features from the same location on source and generated images for every pixel. However, there are instances where it is necessary to preserve certain areas that contain more information from the source image than other areas in the image. Therefore, we propose anchoring the areas for preservation and intentionally selecting features at the anchor points through a query-selected global attention method. This enables our method to generate an image that preserves the content of the source while transferring the style without the need for additional fine-tuning or an auxiliary network. Our diffusion model follows a simple architecture to enhance image quality and reduce inference time in comparison to other diffusion methods. Our experimental results also demonstrate superior performance.
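
The abstract gives no implementation details. Purely as an illustration, the sketch below shows one way that selecting features at anchor points with global attention could be set up. It assumes an entropy-based selection rule of the kind used in earlier query-selected attention work: rows of a global self-attention map with the lowest entropy are treated as the anchor queries, and the same rows are then used to aggregate features from both the source and the generated image. The function name `query_selected_global_attention`, the parameter `num_queries`, and the entropy criterion itself are assumptions made for this example, not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def query_selected_global_attention(src_feat, gen_feat, num_queries):
    """Hypothetical sketch of entropy-based query selection.

    src_feat, gen_feat: (HW, C) arrays of flattened per-location features
    from the source and generated images (assumed inputs).
    Returns the selected anchor indices and the attention-aggregated
    features for both images, each of shape (num_queries, C).
    """
    # Global self-attention over all source locations: shape (HW, HW).
    attn = softmax(src_feat @ src_feat.T, axis=-1)
    # Entropy of each query row; low entropy means focused attention.
    entropy = -(attn * np.log(attn + 1e-12)).sum(axis=-1)
    # Keep the num_queries rows with the lowest entropy as anchor points.
    idx = np.argsort(entropy)[:num_queries]
    attn_sel = attn[idx]  # shape (num_queries, HW)
    # Route both feature maps through the same selected attention rows.
    return idx, attn_sel @ src_feat, attn_sel @ gen_feat

# Toy usage: 16 spatial locations, 8 channels, 4 anchor queries.
rng = np.random.default_rng(0)
src = rng.standard_normal((16, 8))
gen = rng.standard_normal((16, 8))
anchors, src_out, gen_out = query_selected_global_attention(src, gen, 4)
print(anchors.shape, src_out.shape, gen_out.shape)
```

In a contrastive setup, `src_out[i]` and `gen_out[i]` would form a positive pair and the remaining rows would serve as negatives. That pairing is also an assumption here, since the record does not describe the loss.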

Pages: 1162-1166

Page count: 5

50 items in total

[31] Attention-guided LiDAR segmentation and odometry using image-to-point cloud saliency transfer
Ding, Guanqun
Imamoglu, Nevrez
Caglayan, Ali
Murakawa, Masahiro
Nakamura, Ryosuke
MULTIMEDIA SYSTEMS, 2024, 30 (04)
[32] S2WAT: Image Style Transfer via Hierarchical Vision Transformer Using Strips Window Attention
Zhang, Chiyu
Xu, Xiaogang
Wang, Lei
Dai, Zaiyan
Yang, Jun
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024 : 7024 - 7032
[33] Pedestrian Gender Recognition by Style Transfer of Visible-Light Image to Infrared-Light Image Based on an Attention-Guided Generative Adversarial Network
Baek, Na Rae
Cho, Se Woon
Koo, Ja Hyung
Park, Kang Ryoung
MATHEMATICS, 2021, 9 (20)
[34] FST-OAM: a fast style transfer model using optimized self-attention mechanism
Du, Xiaozhi
Jia, Ning
Du, Hongyuan
SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (05) : 4191 - 4203
[35] Fine-Grained Human Hair Segmentation Using a Text-to-Image Diffusion Model
Kim, Dohyun
Lee, Euna
Yoo, Daehyun
Lee, Hongchul
IEEE ACCESS, 2024, 12 : 13912 - 13922
[36] Image and Text Aspect Level Multimodal Sentiment Classification Model Using Transformer and Multilayer Attention Interaction
Yin, Xiuye
Chen, Liyong
INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2023, 19 (01) : 22 - 22
[37] A pan-sharpening model using dual-branch attention-guided diffusion networks
Zheng, Huangqimei
Pan, Chengyi
Jin, Xin
Wozniak, Michal
Wang, Puming
Lee, Shin-Jye
Jiang, Qian
INFORMATION FUSION, 2025, 120
[38] An Underwater Image Enhancement Method Based on Diffusion Model Using Dual-Layer Attention Mechanism
Zhang, Hong
He, Ran
Fang, Wei
WATER, 2024, 16 (13)
[39] Correction to: Personalized smile synthesis using attention-guided global parametric model and local non-parametric model
Ching-Ting Tu
Sung-Hsien Hsieh
Kuan-Lin Chen
Jenn-Jier James Lien
Multimedia Tools and Applications, 2023, 82 : 21611 - 21611
[40] LayerDiff: Exploring Text-Guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
Huang, Runhui
Cai, Kaixin
Hang, Jianhua
Liang, Xiaodan
Pei, Renjing
Lu, Guansong
Xu, Songcen
Zhang, Wei
Xu, Hang
COMPUTER VISION - ECCV 2024, PT LXXVI, 2025, 15134 : 144 - 160
