CFOR: Character-First Open-Set Text Recognition via Context-Free Learning

Cited by: 0
Authors
Liu, Chang [1, 2]
Yang, Chun [1]
Fang, Zhiyu [1]
Qin, Hai-Bo [1]
Yin, Xu-Cheng [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China
[2] Lulea Tekn Univ, ML Grp, S-97187 Lulea, Sweden
Funding
National Natural Science Foundation of China
Keywords
Zero-shot learning; anomaly detection; text recognition; NETWORK;
DOI
10.1109/TIP.2024.3480711
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The open-set text recognition task is a generalized form of the (close-set) text recognition task, where the model is further challenged to spot and incrementally recognize novel characters not covered by the training data. Novel characters also indicate that the language model of the training set is biased from the "real-world". In this work, we alleviate the confounding effect of such biases by learning from individual character representations isolated from their context. Specifically, we propose a Character-First Open-Set Text Recognition framework that cotrains the feature extractor with two context-free learning tasks. First, a Context Isolation Learning task is proposed to wipe the context for each character from the input image, utilizing a character mask learned in a weak supervision manner. Second, the framework adopts an Individual Character Learning task, which is a single-character classification task with synthetic samples. After training on English and simplified Chinese data, our framework can adapt to recognize unseen characters in Japanese, Korean, Greek, and other scripts without retraining, and can reliably spot unseen characters in Japanese with an F1-score over 64%. The framework also shows 91.5% line accuracy on IIIT5k and a speed of over 69 FPS single-batched, making it a feasible universal lightweight OCR solution that works well for both open-set and close-set use cases.
Pages: 6497-6507
Page count: 11