Large-Scale Visual Font Recognition

Cited by: 22
Authors
Chen, Guang [2]
Yang, Jianchao [1]
Jin, Hailin [1]
Brandt, Jonathan [1]
Shechtman, Eli [1]
Agarwala, Aseem [1]
Han, Tony X. [2]
Affiliations
[1] Adobe Res, San Jose, CA USA
[2] Univ Missouri, Columbia, MO 65211 USA
DOI
10.1109/CVPR.2014.460
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the large-scale visual font recognition (VFR) problem, which aims at automatic identification of the typeface, weight, and slope of the text in an image or photo without any knowledge of content. Although visual font recognition has many practical applications, it has largely been neglected by the vision community. To address the VFR problem, we construct a large-scale dataset containing 2,420 font classes, which easily exceeds the scale of most image categorization datasets in computer vision. As font recognition is inherently dynamic and open-ended, i.e., new classes and data for existing categories are constantly added to the database over time, we propose a scalable solution based on the nearest class mean classifier (NCM). The core algorithm is built on local feature embedding, local feature metric learning and max-margin template selection, which is naturally amenable to NCM and thus to such open-ended classification problems. The new algorithm can generalize to new classes and new data at little added cost. Extensive experiments demonstrate that our approach is very effective on our synthetic test images, and achieves promising results on real world test images.
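To make the open-ended property concrete, here is a minimal sketch of a generic nearest class mean (NCM) classifier. It is not the paper's algorithm: the local feature embedding, metric learning, and template selection steps are omitted, and the class name `NearestClassMean`, the plain Euclidean distance, and the 2-D toy features are all assumptions made for illustration. It only shows why adding a class or a sample is cheap: each class is summarized by a running mean, so nothing else has to be retrained.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


class NearestClassMean:
    """Generic NCM classifier over plain feature vectors (illustrative only)."""

    def __init__(self):
        self.sums = {}    # label -> per-dimension running sum of features
        self.counts = {}  # label -> number of samples seen for that label

    def add(self, label, feature):
        """Fold one sample into its class; an unseen label creates a new class."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(feature)
            self.counts[label] = 0
        self.sums[label] = [s + x for s, x in zip(self.sums[label], feature)]
        self.counts[label] += 1

    def mean(self, label):
        """Return the current mean feature vector of a class."""
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, feature):
        """Return the label whose class mean is closest to the feature."""
        return min(self.sums, key=lambda label: dist(self.mean(label), feature))


# Hypothetical 2-D features; real inputs would be learned image descriptors.
ncm = NearestClassMean()
ncm.add("serif", [0.0, 0.0])
ncm.add("serif", [0.0, 2.0])
ncm.add("sans", [4.0, 4.0])
before = ncm.predict([9.0, 9.0])

# A new class arrives later: only its own mean is created.
ncm.add("script", [10.0, 10.0])
after = ncm.predict([9.0, 9.0])
```

In this toy run the query is first assigned to the closest existing class and then to the newly added one, without touching the stored means of the other classes.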
Pages: 3598-3605
Number of pages: 8