High-Quality Text-to-Image Generation Using High-Detail Feature-Preserving Network

Cited by: 0
Authors
Hsu, Wei-Yen [1 ,2 ,3 ]
Lin, Jing-Wen [1 ]
Affiliations
[1] Natl Chung Cheng Univ, Dept Informat Management, Chiayi 62102, Taiwan
[2] Natl Chung Cheng Univ, Adv Inst Mfg High Tech Innovat, Chiayi 62102, Taiwan
[3] Natl Chung Cheng Univ, Ctr Innovat Res Aging Society CIRAS, Chiayi 62102, Taiwan
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, No. 02
Keywords
generative adversarial network; text-to-image generation; high detail; feature preservation
DOI
10.3390/app15020706
CLC Number
O6 [Chemistry]
Subject Classification Code
0703
Abstract
Multistage text-to-image generation algorithms have shown remarkable success. However, the images they produce often lack detail and suffer from feature loss, because these methods concentrate on extracting features from images and text while relying on conventional residual blocks for post-extraction feature processing. The resulting feature loss greatly reduces the quality of the generated images and raises the computational cost of feature processing, which severely limits deployment on optical devices such as cameras and smartphones. To address these issues, the High-Detail Feature-Preserving Network (HDFpNet) is proposed to generate high-quality, near-realistic images from text descriptions. An initial text-to-image generation (iT2IG) module first produces initial feature maps while avoiding feature loss. A fast excitation-and-squeeze feature extraction (FESFE) module then recursively generates high-detail, feature-preserving images at lower computational cost through three steps: channel excitation (CE), fast feature extraction (FFE), and channel squeeze (CS). Finally, a channel attention (CA) mechanism further enriches the feature details. Experimental results on the CUB-Bird and MS-COCO datasets demonstrate that HDFpNet outperforms state-of-the-art methods in both quantitative performance and visual quality, particularly in detail rendering and feature preservation.
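The abstract names the three FESFE steps (CE, FFE, CS) and the CA mechanism but gives no equations or layer definitions. The PyTorch sketch below is a hedged approximation of such a block, assuming CE widens channels with a 1x1 convolution, FFE applies a cheap depthwise convolution, CS projects back to the original width, and CA follows the familiar squeeze-and-excitation form; the class names (FESFEBlock, ChannelAttention), the expansion ratio, and all layer choices are illustrative assumptions, not the authors' published implementation.

# Hypothetical sketch of an FESFE-style block with channel attention.
# All names, ratios, and layer choices are assumptions for illustration,
# not the implementation described in the paper.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (a standard CA form)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global per-channel statistics
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight channels by learned importance

class FESFEBlock(nn.Module):
    """Channel excitation -> fast feature extraction -> channel squeeze,
    with a residual connection so the input features are preserved."""
    def __init__(self, channels: int, expand: int = 2):
        super().__init__()
        mid = channels * expand
        self.ce = nn.Conv2d(channels, mid, kernel_size=1)   # CE: widen channels
        self.ffe = nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                             groups=mid)                    # FFE: cheap depthwise conv
        self.cs = nn.Conv2d(mid, channels, kernel_size=1)   # CS: project back down
        self.act = nn.GELU()
        self.ca = ChannelAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.ce(x))
        h = self.act(self.ffe(h))
        h = self.cs(h)
        return x + self.ca(h)  # residual path preserves the initial features

if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)  # e.g., an initial feature map from iT2IG
    out = FESFEBlock(64)(feats)
    print(out.shape)  # torch.Size([1, 64, 32, 32])

Under these assumptions, the residual connection carries the iT2IG feature map through unchanged while the excite-extract-squeeze path adds detail, and applying CA only to the added branch lets the block enrich channels without overwriting the preserved features; the depthwise FFE convolution keeps the widened middle stage computationally cheap, consistent with the abstract's lower-cost claim.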
Pages: 16