High-Quality Text-to-Image Generation Using High-Detail Feature-Preserving Network

Cited by: 0
Authors
Hsu, Wei-Yen [1 ,2 ,3 ]
Lin, Jing-Wen [1 ]
Affiliations
[1] Natl Chung Cheng Univ, Dept Informat Management, Chiayi 62102, Taiwan
[2] Natl Chung Cheng Univ, Adv Inst Mfg High Tech Innovat, Chiayi 62102, Taiwan
[3] Natl Chung Cheng Univ, Ctr Innovat Res Aging Society CIRAS, Chiayi 62102, Taiwan
Source
APPLIED SCIENCES-BASEL | 2025, Vol. 15, No. 02
Keywords
generative adversarial network; text-to-image generation; high detail; feature preservation
DOI
10.3390/app15020706
CLC Number
O6 [Chemistry]
Subject Classification Code
0703
Abstract
Multistage text-to-image generation algorithms have shown remarkable success. However, the images they produce often lack detail and suffer from feature loss, because these methods concentrate on extracting features from images and text while relying on conventional residual blocks for post-extraction feature processing. The resulting feature loss greatly reduces the quality of the generated images and raises the computational cost of feature processing, which severely limits deployment on optical devices such as cameras and smartphones. To address these issues, the High-Detail Feature-Preserving Network (HDFpNet) is proposed to generate high-quality, near-realistic images from text descriptions. An initial text-to-image generation (iT2IG) module first produces initial feature maps while avoiding feature loss. A fast excitation-and-squeeze feature extraction (FESFE) module then recursively generates high-detail, feature-preserving images at lower computational cost through three steps: channel excitation (CE), fast feature extraction (FFE), and channel squeeze (CS). Finally, a channel attention (CA) mechanism further enriches the feature details. Experimental results on the CUB-Bird and MS-COCO datasets demonstrate that HDFpNet outperforms state-of-the-art methods in both quantitative performance and visual quality, particularly in detail rendering and feature preservation.
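The abstract names the three FESFE steps (CE, FFE, CS) and the CA mechanism but gives no equations or layer definitions. The PyTorch sketch below is a hedged approximation of such a block, assuming CE widens channels with a 1x1 convolution, FFE applies a cheap depthwise convolution, CS projects back to the original width, and CA follows the familiar squeeze-and-excitation form; the class names (FESFEBlock, ChannelAttention), the expansion ratio, and all layer choices are illustrative assumptions, not the authors' published implementation.

# Hypothetical sketch of an FESFE-style block with channel attention.
# All names, ratios, and layer choices are assumptions for illustration,
# not the implementation described in the paper.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (a standard CA form)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global per-channel statistics
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight channels by learned importance

class FESFEBlock(nn.Module):
    """Channel excitation -> fast feature extraction -> channel squeeze,
    with a residual connection so the input features are preserved."""
    def __init__(self, channels: int, expand: int = 2):
        super().__init__()
        mid = channels * expand
        self.ce = nn.Conv2d(channels, mid, kernel_size=1)   # CE: widen channels
        self.ffe = nn.Conv2d(mid, mid, kernel_size=3, padding=1,
                             groups=mid)                    # FFE: cheap depthwise conv
        self.cs = nn.Conv2d(mid, channels, kernel_size=1)   # CS: project back down
        self.act = nn.GELU()
        self.ca = ChannelAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.ce(x))
        h = self.act(self.ffe(h))
        h = self.cs(h)
        return x + self.ca(h)  # residual path preserves the initial features

if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)  # e.g., an initial feature map from iT2IG
    out = FESFEBlock(64)(feats)
    print(out.shape)  # torch.Size([1, 64, 32, 32])

Under these assumptions, the residual connection carries the iT2IG feature map through unchanged while the excite-extract-squeeze path adds detail, and applying CA only to the added branch lets the block enrich channels without overwriting the preserved features; the depthwise FFE convolution keeps the widened middle stage computationally cheap, consistent with the abstract's lower-cost claim.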
Pages: 16