XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

Cited by: 25
Authors
Liu, Wei [1]
Liu, Fangyue [1]
Ding, Fei [1]
He, Qian [1]
Yi, Zili [1]
Affiliations
[1] ByteDance Ltd, Beijing, People's Republic of China
DOI
10.1109/CVPR52688.2022.00775
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Generating a new font library is labor-intensive and time-consuming for glyph-rich scripts. Few-shot font generation is therefore desirable, as it needs only a few reference glyphs and no fine-tuning at test time. Existing methods follow the style-content disentanglement paradigm and expect novel fonts to be produced by combining the style codes of the reference glyphs with the content representations of the source. However, these few-shot font generation methods either fail to capture content-independent style representations or employ localized component-wise style representations, which are insufficient to model many Chinese font styles that involve hyper-component features such as inter-component spacing and "connected-stroke". To resolve these drawbacks and make the style representations more reliable, we propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder that is conditioned jointly on the glyph image and the corresponding stroke labels. The cross-modality encoder is pre-trained in a self-supervised manner to capture cross- and intra-modality correlations effectively, which facilitates content-style disentanglement and the modeling of style representations at all scales (stroke-level, component-level and character-level). The pre-trained encoder is then applied to the downstream font generation task without fine-tuning. Experimental comparisons with state-of-the-art methods demonstrate that our method successfully transfers styles at all scales. In addition, it requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task (28% lower than the second best).
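The idea of an encoder "conditioned jointly on the glyph image and the corresponding stroke labels" can be illustrated with a toy sketch. This is not the authors' implementation: the record gives no architectural details, so every name here (`joint_encode`, `stroke_table`, the modality type vectors, the embedding size, the number of stroke classes) is a hypothetical choice, and the attention layer uses identity projections purely for brevity. The sketch only shows how image-patch tokens and stroke-label tokens could be placed in one sequence so that self-attention covers both intra- and cross-modality pairs.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections, to keep the sketch short.
    Returns the attended tokens and the attention weight matrix.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)]
        )
    return outputs, weights


def joint_encode(patch_embeddings, stroke_ids, stroke_table, type_image, type_stroke):
    """Build one token sequence from both modalities and attend over it.

    Each token gets a (hypothetical) modality type vector added so the two
    kinds of token remain distinguishable after concatenation.
    """
    image_tokens = [[x + t for x, t in zip(p, type_image)] for p in patch_embeddings]
    stroke_tokens = [
        [x + t for x, t in zip(stroke_table[s], type_stroke)] for s in stroke_ids
    ]
    return self_attention(image_tokens + stroke_tokens)


rng = random.Random(0)
DIM = 8  # hypothetical embedding size


def rand_vec():
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


patches = [rand_vec() for _ in range(4)]          # stand-in for glyph image patches
stroke_table = {i: rand_vec() for i in range(5)}  # hypothetical stroke-label embeddings
stroke_ids = [0, 3, 1]                            # hypothetical stroke label sequence

encoded, attn = joint_encode(patches, stroke_ids, stroke_table, rand_vec(), rand_vec())
print(len(encoded), len(encoded[0]))
```

Because the two modalities share one sequence, each row of `attn` contains weights over both image patches and stroke labels, which is the sense in which such an encoder can pick up cross- and intra-modality correlations. The self-supervised pre-training objective mentioned in the abstract is not sketched here, since the record does not describe it.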
Pages: 7895-7904
Page count: 10