Japanese historical character recognition by focusing on character parts

Authors
Ishikawa, Takuru [1]
Miyazaki, Tomo [1]
Omachi, Shinichiro [1]
Affiliations
[1] Tohoku Univ, Grad Sch Engn, 6-6-05 Aoba Aramakiaza, Sendai, Miyagi 9808579, Japan
Keywords
Historical document analysis; Japanese historical character; Learning character parts; Few-shot; Zero-shot recognition
DOI
10.1016/j.patcog.2023.110181
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Japanese historical documents are a valuable source of information, and character recognition is a critical technology for digitizing them. Sample imbalance is a significant obstacle in recognizing Japanese historical characters (kuzushiji): thousands of kuzushiji have only a few samples at most, so recognition performance deteriorates sharply for them. In this study, we propose a framework for transferring knowledge of character parts from font images to kuzushiji. Pretraining learns character parts from synthesized font images. Fine-tuning on kuzushiji is more complex: we propose a mean squared error loss between the feature vectors of kuzushiji and font images, which makes the feature vectors of kuzushiji and fonts consistent. Consequently, zero-shot recognition becomes possible by using the font images of kuzushiji that have no training samples. The experimental results show that the proposed method recognized zero-sampled kuzushiji with approximately 48% accuracy, which significantly expands the number of recognizable kuzushiji.
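The two ideas in the abstract, a mean squared error loss that pulls kuzushiji features toward font features and zero-shot recognition from font images, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the three-dimensional feature vectors are toy values rather than outputs of the authors' network, the names `mse_loss` and `zero_shot_classify` are hypothetical, and the nearest-font-feature decision rule is one plausible way to use consistent features, not necessarily the procedure used in the paper.

```python
from typing import Dict, Sequence


def mse_loss(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def zero_shot_classify(query: Sequence[float],
                       font_features: Dict[str, Sequence[float]]) -> str:
    """Return the label whose font feature is closest to the query (assumed rule)."""
    return min(font_features, key=lambda label: mse_loss(query, font_features[label]))


# Toy font-image features, one per character class (hypothetical values).
font_features = {
    "あ": [0.9, 0.1, 0.0],
    "い": [0.1, 0.8, 0.1],
    "う": [0.0, 0.2, 0.9],
}

# Toy feature of a kuzushiji image whose class has no kuzushiji training samples.
kuzushiji_feature = [0.2, 0.7, 0.2]

# During fine-tuning, a loss of this form would be minimized for paired images.
alignment_loss = mse_loss(kuzushiji_feature, font_features["い"])

# At test time, the font features alone are enough to assign a label.
predicted = zero_shot_classify(kuzushiji_feature, font_features)
print(predicted, round(alignment_loss, 4))
```

The point of the sketch is only that once both kinds of image map to nearby feature vectors, a class can be recognized from its font image without any handwritten sample.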
Pages: 8