XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation

Cited by: 25
Authors
Liu, Wei [1 ]
Liu, Fangyue [1 ]
Ding, Fei [1 ]
He, Qian [1 ]
Yi, Zili [1 ]
Affiliations
[1] ByteDance Ltd, Beijing, Peoples R China
Keywords
DOI
10.1109/CVPR52688.2022.00775
CLC Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generating a new font library for glyph-rich scripts is a labor-intensive and time-consuming job. Few-shot font generation is therefore desirable, since it needs only a few glyph references and no fine-tuning at test time. Existing methods follow the style-content disentanglement paradigm and expect novel fonts to be produced by combining the style codes of the reference glyphs with the content representations of the source. However, these few-shot font generation methods either fail to capture content-independent style representations or employ localized, component-wise style representations, which are insufficient to model many Chinese font styles involving hyper-component features such as inter-component spacing and "connected strokes". To resolve these drawbacks and make the style representations more reliable, we propose a self-supervised cross-modality pre-training strategy and a cross-modality transformer-based encoder that is conditioned jointly on the glyph image and the corresponding stroke labels. The cross-modality encoder is pre-trained in a self-supervised manner to effectively capture cross- and intra-modality correlations, which facilitates content-style disentanglement and the modeling of style representations at all scales (stroke level, component level, and character level). The pre-trained encoder is then applied to the downstream font generation task without fine-tuning. Experimental comparisons with state-of-the-art methods demonstrate that our method successfully transfers styles at all scales. In addition, it requires only one reference glyph and achieves the lowest rate of bad cases in the few-shot font generation task (28% lower than the second best).
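To make the cross-modality encoder described in the abstract more concrete, below is a minimal PyTorch sketch of a transformer encoder that jointly attends over glyph-image patch tokens and stroke-label tokens. It is an illustration under assumptions, not the authors' implementation: the module name CrossModalityGlyphEncoder, all dimensions, the stroke vocabulary size, and the use of a [CLS]-style summary token are hypothetical.

```python
# Illustrative sketch only: a minimal cross-modality transformer encoder that
# jointly attends over glyph-image patch tokens and stroke-label tokens, in the
# spirit of the paper's description. All module names, sizes, and the stroke
# vocabulary are assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class CrossModalityGlyphEncoder(nn.Module):
    def __init__(self, img_size=128, patch=16, d_model=256,
                 n_heads=8, n_layers=6, num_stroke_types=32, max_strokes=40):
        super().__init__()
        # Image branch: split the glyph image into patches and embed each patch.
        self.patch_embed = nn.Conv2d(1, d_model, kernel_size=patch, stride=patch)
        n_patches = (img_size // patch) ** 2
        # Stroke branch: embed the sequence of stroke-type labels for the character.
        self.stroke_embed = nn.Embedding(num_stroke_types, d_model)
        # Learnable positional embeddings plus a [CLS]-style summary token.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos_embed = nn.Parameter(torch.zeros(1, 1 + n_patches + max_strokes, d_model))
        # Shared transformer: self-attention over the concatenated token sequence
        # captures both cross- and intra-modality correlations.
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, glyph_img, stroke_labels):
        # glyph_img: (B, 1, H, W) grayscale glyph; stroke_labels: (B, max_strokes) int ids.
        b = glyph_img.size(0)
        img_tokens = self.patch_embed(glyph_img).flatten(2).transpose(1, 2)  # (B, N, D)
        stroke_tokens = self.stroke_embed(stroke_labels)                      # (B, S, D)
        cls = self.cls_token.expand(b, -1, -1)
        tokens = torch.cat([cls, img_tokens, stroke_tokens], dim=1)
        tokens = tokens + self.pos_embed[:, : tokens.size(1)]
        out = self.encoder(tokens)
        # The [CLS] output can serve as a character-level style summary; the
        # per-token outputs retain stroke- and component-level information.
        return out[:, 0], out[:, 1:]


if __name__ == "__main__":
    enc = CrossModalityGlyphEncoder()
    img = torch.randn(2, 1, 128, 128)
    strokes = torch.randint(0, 32, (2, 40))
    style_summary, token_features = enc(img, strokes)
    print(style_summary.shape, token_features.shape)  # (2, 256) and (2, 104, 256)
```

Concatenating both modalities into a single token sequence lets every self-attention layer mix image and stroke information, which is the kind of cross- and intra-modality correlation the self-supervised pre-training is meant to exploit; how the resulting features are split into style and content codes for the downstream generator is described in the paper itself.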
Pages: 7895 - 7904
Number of pages: 10
Related Papers
50 items in total
  • [21] CDS: Cross-Domain Self-supervised Pre-training
    Kim, Donghyun
    Saito, Kuniaki
    Oh, Tae-Hyun
    Plummer, Bryan A.
    Sclaroff, Stan
    Saenko, Kate
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9103 - 9112
  • [22] Effectiveness of Pre-training for Few-shot Intent Classification
    Zhang, Haode
    Zhang, Yuwei
    Zhan, Li-Ming
    Chen, Jiaxin
    Shi, Guangyuan
    Wu, Xiao-Ming
    Lam, Albert Y. S.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1114 - 1120
  • [23] Conditional Self-Supervised Learning for Few-Shot Classification
    An, Yuexuan
    Xue, Hui
    Zhao, Xingyu
    Zhang, Lu
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 2140 - 2146
  • [24] Self-Supervised Few-Shot Learning on Point Clouds
    Sharma, Charu
    Kaul, Manohar
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [25] FLUID: Few-Shot Self-Supervised Image Deraining
    Rai, Shyam Nandan
    Saluja, Rohit
    Arora, Chetan
    Balasubramanian, Vineeth N.
    Subramanian, Anbumani
Jawahar, C. V.
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 418 - 427
  • [26] Improving In-Context Few-Shot Learning via Self-Supervised Training
    Chen, Mingda
    Du, Jingfei
    Pasunuru, Ramakanth
    Mihaylov, Todor
    Iyer, Srini
    Stoyanov, Veselin
    Kozareva, Zornitsa
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 3558 - 3573
  • [27] Self-supervised Network Evolution for Few-shot Classification
    Tang, Xuwen
    Teng, Zhu
    Zhang, Baopeng
    Fan, Jianping
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3045 - 3051
  • [28] A Survey of Self-Supervised and Few-Shot Object Detection
    Huang, Gabriel
    Laradji, Issam
    Vazquez, David
    Lacoste-Julien, Simon
    Rodriguez, Pau
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4071 - 4089
  • [29] SELF-SUPERVISED LEARNING FOR FEW-SHOT IMAGE CLASSIFICATION
    Chen, Da
    Chen, Yuefeng
    Li, Yuhong
    Mao, Feng
    He, Yuan
    Xue, Hui
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 1745 - 1749
  • [30] MF-Net: A Novel Few-shot Stylized Multilingual Font Generation Method
    Zhang, Yufan
    Man, Junkai
    Sun, Peng
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 2088 - 2096