Fine-tuning Pipeline for Hand Image Generation Using Diffusion Model

Cited by: 0
Authors
Bai, Bingyuan [1]
Xie, Haoran [1]
Miyata, Kazunori [1]
Affiliations
[1] Japan Adv Inst Sci & Technol JAIST, Nomi, Japan
Keywords
text-to-image; hand inpainting; stable diffusion; ControlNet; LoRA
DOI
10.1109/NICOInt62634.2024.00020
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Hand images generated by image generative models such as Stable Diffusion may suffer from distortions. To address this issue, we introduce a hand image fine-tuning pipeline consisting of three stages: hand detection, object masking, and image inpainting. First, a hand detection model is trained to identify flawed hands and mark them with bounding boxes (Bbox). Then, these Bbox regions are masked in conjunction with Mediapipe landmarks. Finally, a ControlNet model is trained to inpaint the masked areas, and a targeted LoRA is also trained to minimize boundary fragmentation. The results indicate that our method achieves better anatomical accuracy in hand reconstruction than the original diffusion model. The introduction of the directional LoRA model further improves the evaluation outcomes.
Pages: 58-63
Number of pages: 6