Human-like cognition: visual features grouping for hard-to-group text dataset

Authors
Li, Xin [1]
Liu, Hangyuan [1]
Tao, Chunfeng [2]
Han, Ruiyi [1]
Yang, Shumin [1]
Affiliations
[1] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao, Peoples R China
[2] Bur Geophys Prospecting Inc BGPCNPC, Zhuozhou, Peoples R China
Keywords
scene text spotting; visual features grouping; text correction
DOI
10.1117/1.JEI.33.2.023002
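Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract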
Most existing arbitrary-shape text detection methods use connected components and text center lines to group text instances, assuming that texts in adjacent positions belong to the same instance. However, many hard-to-group scene texts are too complex to be processed effectively in this way. To address this challenge, we propose a novel scene text-spotting method that uses feature-based clustering inspired by human cognitive principles of text perception. Our approach first uses a character spotter to obtain the location and transcription of each character. Then, a lightweight recognition network extracts the visual features of the characters at their locations. These visual features are grouped into instances by a K-means-fuzzy-net, which explicitly models visual feature similarity to effectively group nested text, large-margin text, continuous text, and text with overlapping characters. Finally, the recognition results of the text instances are processed by a word correction module to improve overall accuracy and reduce sensitivity to errors in individual character detection. Additionally, we contribute a hard-to-group text dataset. Experiments demonstrate the state-of-the-art performance of our method in these hard-to-group scenarios. The hard-to-group text dataset is available at: https://github.com/baggio321/Hard-to-Group-Text-Dataset. © 2024 SPIE and IS&T
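The grouping idea in the abstract (cluster characters by visual feature similarity rather than by spatial adjacency) can be illustrated with a minimal sketch. This is not the authors' K-means-fuzzy-net: plain k-means is swapped in as a stand-in, the two-dimensional feature vectors are invented for illustration, and the number of instances k is assumed to be known in advance. The names kmeans, group_characters, and demo_chars are hypothetical.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization.

    Stand-in for the paper's K-means-fuzzy-net (an assumption, not the real model).
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def group_characters(chars, k):
    """Group (character, x_position, feature_vector) triples into k text instances."""
    labels = kmeans([feat for _, _, feat in chars], k)
    groups = {}
    for (ch, x, _), lab in zip(chars, labels):
        groups.setdefault(lab, []).append((x, ch))
    # Read each instance left to right, then sort instances for a stable result.
    return sorted("".join(ch for _, ch in sorted(g)) for g in groups.values())


# Hypothetical nested text: small "new" interleaved with large "SALE".
# Features are made up; in the paper they come from a recognition network.
demo_chars = [
    ("S", 0, (0.9, 0.1)), ("n", 1, (0.1, 0.8)), ("A", 2, (1.0, 0.0)),
    ("e", 3, (0.2, 0.9)), ("L", 4, (0.8, 0.2)), ("w", 5, (0.0, 1.0)),
    ("E", 6, (0.9, 0.0)),
]
result = group_characters(demo_chars, 2)
print(result)
```

Although the characters alternate along the x axis, so adjacency-based grouping would merge them into one string, clustering on the feature vectors separates the two instances.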
Pages: 18