Fine-tuning Pipeline for Hand Image Generation Using Diffusion Model

Cited by: 0
Authors
Bai, Bingyuan [1 ]
Xie, Haoran [1 ]
Miyata, Kazunori [1 ]
Affiliations
[1] Japan Adv Inst Sci & Technol JAIST, Nomi, Japan
Keywords
text-to-image; hand inpainting; stable diffusion; ControlNet; LoRA
DOI
10.1109/NICOInt62634.2024.00020
CLC Number
TP31 [Computer Software];
Discipline Code
081202 ; 0835 ;
Abstract
Hand images generated by image generative models such as Stable Diffusion may suffer from distortions. To address this issue, we introduce a hand image fine-tuning pipeline consisting of three stages: hand detection, object masking, and image inpainting. First, a hand detection model is trained to identify flawed hands using bounding boxes (Bbox). These Bbox regions are then masked in conjunction with Mediapipe landmarks. Finally, a ControlNet model is trained to inpaint the masked areas, and a targeted LoRA is trained to minimize boundary fragmentation. The results indicate that our method achieves better anatomical accuracy in hand reconstruction than the original diffusion model, and introducing the directional LoRA model further improves the evaluation outcomes.
Pages: 58-63
Page count: 6
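
The abstract describes a three-stage pipeline: detect flawed hands, mask the Bbox regions with the help of Mediapipe landmarks, and inpaint the masked areas with a ControlNet plus a targeted LoRA. The sketch below is a minimal, hedged reconstruction of those stages with off-the-shelf tools, not the authors' released code: the flawed-hand detector is left as a placeholder (the paper trains its own), the masking rule (filled Bbox refined by a dilated landmark hull) is an assumption, and the checkpoint and LoRA paths are illustrative stand-ins.

```python
# Minimal sketch of the three-stage pipeline under the assumptions stated above.
# detect_flawed_hands(), the masking heuristic, and the checkpoint/LoRA paths are
# illustrative placeholders, not the authors' released models.
import cv2
import numpy as np
import torch
import mediapipe as mp
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline


def detect_flawed_hands(image_bgr):
    """Stage 1 placeholder: a trained detector would return (x1, y1, x2, y2)
    boxes around anatomically flawed hands."""
    raise NotImplementedError("plug in a trained hand-detection model here")


def build_hand_mask(image_bgr, bbox, dilate_px=15):
    """Stage 2: mask the Bbox region and refine it with Mediapipe hand landmarks."""
    h, w = image_bgr.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    x1, y1, x2, y2 = bbox
    mask[y1:y2, x1:x2] = 255  # coarse region from the detector

    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
        result = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        for lm in result.multi_hand_landmarks:
            pts = np.array([[p.x * w, p.y * h] for p in lm.landmark], dtype=np.int32)
            cv2.fillConvexPoly(mask, cv2.convexHull(pts), 255)

    # Dilation pushes the mask past the hand boundary to reduce seam artifacts.
    return cv2.dilate(mask, np.ones((dilate_px, dilate_px), np.uint8))


def make_inpaint_condition(image, mask_image):
    """Standard diffusers conditioning for the inpaint ControlNet: masked pixels
    are set to -1 so the network knows which region to fill."""
    img = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    msk = np.array(mask_image.convert("L")).astype(np.float32) / 255.0
    img[msk > 0.5] = -1.0
    return torch.from_numpy(img.transpose(2, 0, 1)[None])


def inpaint_hand(image_bgr, mask, prompt="a realistic, well-formed human hand"):
    """Stage 3: ControlNet inpainting with a hand-focused LoRA loaded on top."""
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")
    pipe.load_lora_weights("path/to/hand_lora")  # hypothetical LoRA checkpoint

    image = Image.fromarray(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    mask_image = Image.fromarray(mask)
    control = make_inpaint_condition(image, mask_image)
    return pipe(prompt=prompt, image=image, mask_image=mask_image,
                control_image=control, num_inference_steps=30).images[0]
```

A caller would loop over the detector's boxes, building and inpainting one mask per flawed hand; the LoRA line can be dropped to approximate the pipeline without the boundary-smoothing component.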
Related Papers
50 records in total
  • [41] Deep Image Aesthetics Classification using Inception Modules and Fine-tuning Connected Layer
    Jin, Xin
    Chi, Jingying
    Peng, Siwei
    Tian, Yulu
    Ye, Chaochen
    Li, Xiaodong
    2016 8TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS & SIGNAL PROCESSING (WCSP), 2016,
  • [42] Factorized Convolutional Networks: Unsupervised Fine-Tuning for Image Clustering
    Gui, Liang-Yan
    Gui, Liangke
    Wang, Yu-Xiong
    Morency, Louis-Philippe
    Moura, Jose M. F.
    2018 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2018), 2018, : 1205 - 1214
  • [43] Fine-Tuning Next-Generation Genome Editing Tools
    Kanchiswamy, Chidananda Nagamangala
    Maffei, Massimo
    Malnoy, Mickael
    Velasco, Riccardo
    Kim, Jin-Soo
    TRENDS IN BIOTECHNOLOGY, 2016, 34 (07) : 562 - 574
  • [44] A new pipeline for generating instruction dataset via RAG and self fine-tuning
    Sung, Chih-Wei
    Lee, Yu-Kai
    Tsai, Yin-Te
    2024 IEEE 48TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE, COMPSAC 2024, 2024, : 2308 - 2312
  • [45] Growing a Brain: Fine-Tuning by Increasing Model Capacity
    Wang, Yu-Xiong
    Ramanan, Deva
    Hebert, Martial
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 3029 - 3038
  • [46] Tangent Model Composition for Ensembling and Continual Fine-tuning
    Liu, Tian Yu
    Soatto, Stefano
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18630 - 18640
  • [47] Comprehensive Review of Large Language Model Fine-Tuning
    Zhang, Qintong
    Wang, Yuchao
    Wang, Hexi
    Wang, Junxin
    Chen, Hai
    Computer Engineering and Applications, 2024, 60 (17) : 17 - 33
  • [48] Patent classification by fine-tuning BERT language model
    Lee, Jieh-Sheng
    Hsiang, Jieh
    WORLD PATENT INFORMATION, 2020, 61
  • [49] Inhomogenous cosmological model and fine-tuning of the initial state
    Sundell, Peter
    Vilja, Iiro
    MODERN PHYSICS LETTERS A, 2014, 29 (10)
  • [50] Knowledge Graph Fusion for Language Model Fine-Tuning
    Bhana, Nimesh
    van Zyl, Terence L.
    2022 9TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE, ISCMI, 2022, : 167 - 172