Character-based handwritten text transcription with attention networks

Cited by: 0
Authors
Jason Poulos
Rafael Valle
Affiliations
[1] Duke University, Department of Statistical Science
[2] The Statistical and Applied Mathematical Sciences Institute
[3] NVIDIA Corporation
Source
Keywords
Attention; Convolutional neural networks; Handwritten text recognition; Recurrent neural networks
DOI
Not available
CLC number
Subject classification code
Abstract
This paper approaches the task of handwritten text recognition (HTR) with attentional encoder–decoder networks trained on sequences of characters rather than words. We experiment on lines of text from popular handwriting datasets and compare different activation functions for the attention mechanism that aligns image pixels with target characters. We find that softmax attention focuses heavily on individual characters, while sigmoid attention focuses on multiple characters at each decoding step. When the sequence alignment is one-to-one, softmax attention learns a more precise alignment at each decoding step, whereas the alignment generated by sigmoid attention is much less precise. When a linear function is used to obtain attention weights, the model predicts a character by looking at the entire sequence of characters and performs poorly because it lacks a precise alignment between source and target. Future research may explore HTR in natural scene images, since the model can transcribe handwritten text without producing segmentations or bounding boxes of text in images.
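As a rough illustration of the three activation choices compared in the abstract, the sketch below maps one vector of alignment scores to attention weights with a softmax, a sigmoid, and a linear (identity) function. It is not the authors' implementation: the function names `attention_weights` and `context_vector`, the score values, and the use of NumPy are all assumptions made for this example.

```python
import numpy as np


def attention_weights(scores, activation="softmax"):
    """Map raw alignment scores (one per source position) to attention weights."""
    scores = np.asarray(scores, dtype=float)
    if activation == "softmax":
        # Normalized: positions compete with each other and the weights sum to 1.
        shifted = np.exp(scores - scores.max())
        return shifted / shifted.sum()
    if activation == "sigmoid":
        # Each score is squashed independently into (0, 1),
        # so several positions can receive a large weight at once.
        return 1.0 / (1.0 + np.exp(-scores))
    if activation == "linear":
        # Identity: scores are used directly, with no squashing or normalization.
        return scores
    raise ValueError(f"unknown activation: {activation}")


def context_vector(weights, encoder_states):
    """Weighted sum of encoder states (shape: positions x features)."""
    return weights @ encoder_states


# Hypothetical scores for one decoding step over five source positions.
scores = np.array([0.5, 4.0, 3.5, 0.2, -1.0])
for name in ("softmax", "sigmoid", "linear"):
    print(name, np.round(attention_weights(scores, name), 3))
```

With these made-up scores, the softmax weights concentrate on the highest-scoring position, while the sigmoid assigns a weight close to 1 to both high-scoring positions, which is consistent with the qualitative behavior the abstract describes.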
Pages: 10563 - 10573
Number of pages: 10
Related papers
  • [1] Character-based handwritten text transcription with attention networks
    Poulos, Jason
    Valle, Rafael
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (16) : 10563 - 10573
  • [2] gMLP guided deep networks model for character-based handwritten text transcription
    Mouad Bensouilah
    Mokhtar Taffar
    Mohamed Nadjib Zennir
    [J]. Multimedia Tools and Applications, 2024, 83 : 13557 - 13575
  • [3] gMLP guided deep networks model for character-based handwritten text transcription
    Bensouilah, Mouad
    Taffar, Mokhtar
    Zennir, Mohamed Nadjib
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (05) : 13557 - 13575
  • [4] Character-Based Handwritten Text Recognition of Multilingual Documents
    del Agua, Miguel A.
    Serrano, Nicolas
    Civera, Jorge
    Juan, Alfons
    [J]. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES, 2012, 328 : 187 - 196
  • [5] A character-based postprocessing system for handwritten Japanese address recognition
    Yamanaka, K
    Kuroyanagi, S
    Iwata, A
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1999, E82D (02) : 468 - 474
  • [6] Incorporating Word Attention into Character-Based Word Segmentation
    Higashiyama, Shohei
    Utiyama, Masao
    Sumita, Eiichiro
    Ideuchi, Masao
    Oida, Yoshiaki
    Sakamoto, Yohei
    Okada, Isaac
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019 : 2699 - 2709
  • [7] AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks
    Kass, Dmitrijs
    Vats, Ekta
    [J]. DOCUMENT ANALYSIS SYSTEMS, DAS 2022, 2022, 13237 : 507 - 522
  • [8] Image and Text Fusion for Character-based Breast Cancer Classification
    Qiao, Pan
    Jin, Yanhong
    Chen, Dehua
    Zhang, YuanYuan
    [J]. IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS), 2018 : 298 - 305
  • [9] Handwritten Chinese Character Recognition Based on Attention Mechanism
    Huang Wanrong
    He Kai
    Liu Kun
    Gao Shengnan
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (08)
  • [10] Low-resource neural character-based noisy text normalization
    Mager, Manuel
    Jasso Rosales, Monica
    Cetinoglu, Ozlem
    Meza, Ivan
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 36 (05) : 4921 - 4929