Deep Learning Based Tangut Character Recognition

Authors
Zhang, Guangwei [1]
Han, Xiaomang [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Hist & Civilizat, Xian, Shaanxi, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The Tangut script, a logographic writing system, was used to write the now-extinct Tangut language of the West Xia Dynasty. The large body of Tangut historical documents is still transcribed mainly by hand by Tangut experts, because the Tangut language has not been used since the 16th century and automatic recognition was not feasible in the past. With the help of deep learning, we build an end-to-end Tangut character recognition system to reduce the labor of Tangut experts. The high accuracy of a deep learning system for character recognition depends essentially on a large, well-labeled training dataset. We construct a training dataset containing more than 100,000 labeled Tangut character images, which is used to train a deep convolutional neural network (DCNN) to recognize Tangut characters. The images in the training dataset are taken from Tangut historical documents and are labeled in a cluster-and-label manner to reduce human effort. Trained on this dataset, the DCNN reaches a validation accuracy of more than 94% in our experiments. In the future, we will release the training dataset for further study and build an OCR system for transcribing Tangut historical documents automatically.
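The cluster-and-label idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which features, clustering algorithm, or labeling interface the authors used, so everything below is an assumption for illustration only: toy 2-D feature vectors stand in for character images, a small k-means routine does the grouping, and the hypothetical `ask_expert` callback stands in for a human who labels one representative per cluster.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means; returns the cluster index assigned to each point.

    Assumption: k-means is used here only as a stand-in clustering step.
    """
    centroids = [list(c) for c in initial_centroids]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def cluster_and_label(points, initial_centroids, ask_expert):
    """Label every point while querying the expert once per cluster.

    `ask_expert(index)` is a hypothetical callback that returns the label
    of the sample at `index`; its label is copied to the whole cluster.
    """
    assignment = kmeans(points, initial_centroids)
    clusters = defaultdict(list)
    for index, k in enumerate(assignment):
        clusters[k].append(index)

    labels = [None] * len(points)
    for members in clusters.values():
        label = ask_expert(members[0])  # one human decision per cluster
        for index in members:
            labels[index] = label
    return labels


if __name__ == "__main__":
    # Toy features: two tight groups standing in for two character classes.
    features = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
    expert_answers = {0: "char_A", 2: "char_B"}  # hypothetical labels
    print(cluster_and_label(features, [features[0], features[2]],
                            expert_answers.__getitem__))
```

In this sketch the expert is asked about two samples rather than four, which is the source of the labor saving. In practice a cluster can mix several characters, so a review pass over each cluster would still be needed before the labels are used for training.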
Pages: 437-441
Page count: 5