A graph-based solution for writer identification from handwritten text

Cited by: 2

Authors:

Rahman, Atta Ur [1]

Halim, Zahid [1]

Affiliation:

[1] Ghulam Ishaq Khan Inst Engn Sci & Technol, Fac Comp Sci & Engn, Machine Intelligence Res Grp MInG, Topi, Pakistan

Source:

KNOWLEDGE AND INFORMATION SYSTEMS | 2022 / Volume 64 / Issue 06

Keywords:

Writer identification; Preprocessing; Graph-based representation; Feature extraction; Ensemble learning; INDIVIDUALITY; CODEBOOK; FEATURES;

DOI:

10.1007/s10115-022-01676-7

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Writer identification is an active research problem due to its applications in forensic and historical document analysis. It is challenging to identify a writer from the shapes of her handwritten characters, produced via a practiced writing style. Different writing shapes, styles, and orientations, varying character sizes, complex structures, inconsistency, and the cursive nature of the text make the task harder. To solve this problem, we need to explore a structural representation and the spatial information of the handwritten characters. For this, a novel graph-based approach is proposed here to spatially map the handwritten text, adapt to its structure and size, and explore the relationships that exist between them. First, image processing steps such as binarization, baseline correction, separation of the writing region, and thinning of the strokes to a width of a single pixel are executed. This work presents a novel algorithm for detecting key points (KPs) in a handwritten skeleton image and extracting their two-dimensional pixel coordinate values. The handwriting samples are then transformed into a graph-based representation, with KPs as the nodes and the line segments connecting adjacent KPs as the edges. Features are extracted from the graph-based representations of the handwritten text. For classification, ensemble learning approaches are employed. Four benchmark datasets and one custom-collected dataset are used for experimentation. The proposed solution achieves identification accuracies of 98.26%, 98.84%, 99.67%, 98.51%, and 97.73% on the CERUG-EN, CVL, Firemaker, IAM, and custom datasets, respectively.
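
The abstract does not spell out the key-point detection algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a key point is any skeleton pixel whose number of 8-connected neighbours differs from two (a stroke endpoint or a junction), and that two key points are adjacent when a chain of ordinary skeleton pixels joins them. The function names, the neighbour rule, and the three summary features are all hypothetical choices made for illustration.

```python
import math

# Offsets of the eight pixels surrounding a given pixel.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                     (0, 1), (1, -1), (1, 0), (1, 1)]


def skeleton_pixels(image):
    """Return the set of (row, col) coordinates of foreground pixels."""
    return {(r, c) for r, row in enumerate(image)
            for c, value in enumerate(row) if value}


def neighbours(pixel, pixels):
    """Return the 8-connected foreground neighbours of a pixel."""
    r, c = pixel
    return [(r + dr, c + dc) for dr, dc in NEIGHBOUR_OFFSETS
            if (r + dr, c + dc) in pixels]


def find_key_points(pixels):
    """Assumed rule: endpoints and junctions have a neighbour count != 2."""
    return {p for p in pixels if len(neighbours(p, pixels)) != 2}


def build_graph(image):
    """Return (key_points, edges); each edge is a frozenset of two key points."""
    pixels = skeleton_pixels(image)
    key_points = find_key_points(pixels)
    edges = set()
    for kp in key_points:
        for start in neighbours(kp, pixels):
            previous, current = kp, start
            # Ordinary pixels have exactly two neighbours, so the walk
            # always has a single way forward until a key point is hit.
            while current not in key_points:
                onward = [n for n in neighbours(current, pixels)
                          if n != previous]
                previous, current = current, onward[0]
            if current != kp:  # ignore loops that return to their start
                edges.add(frozenset((kp, current)))
    return key_points, edges


def graph_features(key_points, edges):
    """Illustrative features: node count, edge count, mean edge length."""
    lengths = [math.dist(*tuple(edge)) for edge in edges]
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return {"nodes": len(key_points), "edges": len(edges),
            "mean_edge_length": mean_length}
```

Feature vectors of this kind could then be passed to an ensemble classifier, as the abstract describes. On real skeletons, 4-connected corners tend to produce clusters of neighbouring key points under this simple rule, so a practical implementation would likely need to merge or prune them.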

Pages: 1501-1523

Page count: 23
