Human-like cognition: visual features grouping for hard-to-group text dataset

Cited by: 0

Authors:

Li, Xin [1]

Liu, Hangyuan [1]

Tao, Chunfeng [2]

Han, Ruiyi [1]

Yang, Shumin [1]

Affiliations:

[1] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao, Peoples R China

[2] Bur Geophys Prospecting Inc BGPCNPC, Zhuozhou, Peoples R China

Source:

JOURNAL OF ELECTRONIC IMAGING | 2024 / Vol. 33 / Issue 02

Keywords:

scene text spotting; visual features grouping; text correction;

DOI:

10.1117/1.JEI.33.2.023002

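Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code: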

0808 ; 0809 ;

Abstract:

Most existing arbitrary-shape text detection methods group text instances using connected components and text center lines, which assumes that texts in adjacent positions belong to the same instance. However, many hard-to-group scene texts are too complex to be processed effectively in this way. To address this challenge, we propose a novel scene text-spotting method that uses feature-based clustering inspired by human cognitive principles of text perception. Our approach first uses a character spotter to obtain the location and transcription of each character. Then, a lightweight recognition network extracts the visual features of the characters at their locations. These visual features are grouped into instances by a K-means-fuzzy-net, which explicitly models visual feature similarity to effectively group nested text, large-margin text, continuous text, and text with overlapping characters. Finally, the recognition results of the text instances are processed by a word correction module to improve overall accuracy and reduce sensitivity to errors in individual character detection. Additionally, we contribute a hard-to-group text dataset. Experiments demonstrate the state-of-the-art performance of our method in addressing these hard-to-group scenarios. The hard-to-group text dataset is available at: https://github.com/baggio321/Hard-to-Group-Text-Dataset. © 2024 SPIE and IS&T
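The grouping idea in the abstract (cluster characters by visual-feature similarity rather than by spatial adjacency) can be illustrated with a minimal sketch. This is not the authors' K-means-fuzzy-net, which this record does not describe; it substitutes plain K-means with a farthest-point initialisation. The function names, the two-dimensional feature vectors, and the example characters are all assumptions made up for illustration.

```python
import numpy as np


def farthest_point_init(features, k):
    """Pick k initial centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic)."""
    chosen = [0]
    while len(chosen) < k:
        diffs = features[:, None, :] - features[chosen][None, :, :]
        nearest = (diffs ** 2).sum(axis=2).min(axis=1)
        chosen.append(int(nearest.argmax()))
    return features[chosen].copy()


def kmeans_labels(features, k, n_iter=50):
    """Assign each feature vector to one of k clusters with plain K-means."""
    features = np.asarray(features, dtype=float)
    centroids = farthest_point_init(features, k)
    labels = None
    for _ in range(n_iter):
        # Squared Euclidean distance from every feature to every centroid.
        diffs = features[:, None, :] - centroids[None, :, :]
        new_labels = (diffs ** 2).sum(axis=2).argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels


def group_characters(chars, features, k):
    """Concatenate characters (given in reading order) per cluster."""
    labels = kmeans_labels(features, k)
    groups = ["" for _ in range(k)]
    for ch, lab in zip(chars, labels):
        groups[lab] += ch
    return groups


# Hypothetical input: two words whose characters alternate in position,
# so adjacency-based grouping would merge them into one instance.
chars = ["B", "c", "U", "a", "S", "b"]
features = [[0.0, 0.1], [5.0, 5.1], [0.1, 0.0],
            [5.1, 4.9], [0.0, 0.0], [4.9, 5.0]]
words = group_characters(chars, features, k=2)
print(words)  # ['BUS', 'cab']
```

In this toy input the characters of the two words are adjacent to each other, yet they separate cleanly because grouping depends only on feature distance. A real system would take the features from a recognition network and would need some way to choose the number of instances, which this sketch simply takes as the parameter `k`.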


Pages: 18
