End-to-End Handwritten Text Detection and Transcription in Full Pages

Cited by: 12

Authors:

Carbonell, Manuel [1]

Mas, Joan [2]

Villegas, Mauricio [1]

Fornes, Alicia [2]

Llados, Josep [2]

Affiliations:

[1] Omni Us, Berlin, Germany

[2] Comp Vis Ctr, Barcelona, Spain

Source:

2019 INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION WORKSHOPS (ICDARW), VOL 5 | 2019

Keywords:

Handwritten Text Recognition; Layout Analysis; Text segmentation; Deep Neural Networks; Multi-task learning;

DOI:

10.1109/ICDARW.2019.40077

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

When transcribing handwritten document images, inaccuracies in the text segmentation step often cause errors in the subsequent transcription step. For this reason, some recent methods propose performing recognition at the paragraph level. Even so, errors in paragraph segmentation can still affect transcription performance. In this work, we propose an end-to-end framework to transcribe full pages. Joint text detection and transcription removes the need for layout analysis at test time. The experimental results show that our approach can achieve results comparable to those of models that assume segmented paragraphs, and suggest that joining the two tasks brings an improvement over performing them separately.

Pages: 29-34

Number of pages: 6
