XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

Cited by: 25
Authors
Liu, Wei [1]
Liu, Fangyue [1]
Ding, Fei [1]
He, Qian [1]
Yi, Zili [1]
Affiliations
[1] ByteDance Ltd, Beijing, Peoples R China
DOI
10.1109/CVPR52688.2022.00775
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generating a new font library is a labor-intensive and time-consuming job for glyph-rich scripts. Few-shot font generation is therefore desirable, as it requires only a few glyph references and no fine-tuning at test time. Existing methods follow the style-content disentanglement paradigm and expect novel fonts to be produced by combining the style codes of the reference glyphs with the content representations of the source. However, these few-shot font generation methods either fail to capture content-independent style representations or employ localized component-wise style representations, which are insufficient to model many Chinese font styles that involve hyper-component features such as inter-component spacing and "connected-stroke". To resolve these drawbacks and make the style representations more reliable, we propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder that is conditioned jointly on the glyph image and the corresponding stroke labels. The cross-modality encoder is pre-trained in a self-supervised manner to allow effective capture of cross- and intra-modality correlations, which facilitates content-style disentanglement and the modeling of style representations at all scales (stroke-level, component-level and character-level). The pre-trained encoder is then applied to the downstream font generation task without fine-tuning. Experimental comparisons with state-of-the-art methods demonstrate that our method successfully transfers styles of all scales. In addition, it requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task (28% lower than the second best).
Pages: 7895-7904
Page count: 10