End-to-End Detection and Recognition of Arithmetic Expressions

Cited by: 1
Authors
Wan, Jiangpeng [1]
Zhao, Mengbiao [2,3]
Yin, Fei [2,3]
Zhang, Xu-Yao [2,3]
Huang, LinLin [1]
Affiliations
[1] Beijing Jiaotong Univ, Beijing 100044, Peoples R China
[2] Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Keywords
Arithmetic expression spotting; End-to-End training; Sequence-to-sequence
DOI
10.1007/978-3-030-88004-0_41
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The detection and recognition of handwritten arithmetic expressions (AEs) play an important role in document retrieval [21] and analysis. Both tasks are difficult because AEs are structurally complex and vary widely in appearance. In this paper, we propose a novel framework that detects and recognizes AEs in an End-to-End manner. First, an AE detector based on EfficientNet-B1 [17] locates all AE instances efficiently. Given the AE locations, the RoI Rotate module [11] transforms the visual features of each AE proposal. The transformed features are then fed into an attention-based recognizer for AE recognition. The whole network for detection and recognition is trained End-to-End on document images annotated with AE locations and transcripts. Because datasets in this field are scarce, we also construct a dataset named HAED, which contains 1069 images (855 for training and 214 for testing). Extensive experiments on two datasets (HAED and TFD-ICDAR 2019) show that the proposed method achieves competitive performance on both.
Pages: 505-517
Number of pages: 13
Related papers
50 in total
  • [1] End-to-End Spatial Transform Face Detection and Recognition
    Zhang, Hongxin
    Chi, Liying
    [J]. Virtual Reality and Intelligent Hardware, 2020, 2 (02): 119-131
  • [2] An end-to-end framework for the detection of mathematical expressions in scientific document images
    Phong, Bui Hai
    Hoang, Thang Manh
    Le, Thi-Lan
    [J]. EXPERT SYSTEMS, 2022, 39 (01)
  • [3] End-to-End Multimodal Emotion Recognition Based on Facial Expressions and Remote Photoplethysmography Signals
    Li, Jixiang
    Peng, Jianxin
    [J]. IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (10): 6054-6063
  • [4] END-TO-END TRAINING OF A LARGE VOCABULARY END-TO-END SPEECH RECOGNITION SYSTEM
    Kim, Chanwoo
    Kim, Sungsoo
    Kim, Kwangyoun
    Kumar, Mehul
    Kim, Jiyeon
    Lee, Kyungmin
    Han, Changwoo
    Garg, Abhinav
    Kim, Eunhyang
    Shin, Minkyoo
    Singh, Shatrughan
    Heck, Larry
    Gowda, Dhananjaya
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019: 562-569
  • [5] EEM: An End-to-end Evaluation Metric for Scene Text Detection and Recognition
    Hao, Jiedong
    Wen, Yafei
    Deng, Jie
    Gan, Jun
    Ren, Shuai
    Tan, Hui
    Chen, Xiaoxin
    [J]. DOCUMENT ANALYSIS AND RECOGNITION, ICDAR 2021, PT IV, 2021, 12824: 95-108
  • [6] End-to-End Analysis for Text Detection and Recognition in Natural Scene Images
    Alnefaie, Ahlam
    Gupta, Deepak
    Bhuyan, Monowar H.
    Razzak, Imran
    Gupta, Prashant
    Prasad, Mukesh
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [7] A Light CNN for End-to-End Car License Plates Detection and Recognition
    Wang, Wanwei
    Yang, Jun
    Chen, Min
    Wang, Peng
    [J]. IEEE ACCESS, 2019, 7: 173875-173883
  • [8] End-to-end multibranch network for palm vein recognition and liveness detection
    Shen, Wenzhong
    Liang, Juan
    [J]. JOURNAL OF ELECTRONIC IMAGING, 2024, 33 (01)
  • [9] FVI: An End-to-end Vietnamese Identification Card Detection and Recognition in Images
    Hoang Danh Liem
    Nguyen Duc Minh
    Nguyen Bao Trung
    Hoang Tien Duc
    Pham Hoang Hiep
    Doan Viet Dung
    Dang Hoang Vu
    [J]. PROCEEDINGS OF 2018 5TH NAFOSTED CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS 2018), 2018: 338-340
  • [10] END-TO-END MULTIMODAL SPEECH RECOGNITION
    Palaskar, Shruti
    Sanabria, Ramon
    Metze, Florian
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018: 5774-5778