A comparison of character n-grams and dictionaries used for script recognition

Cited by: 3

Authors:

Brakensiek, A [1]

Rigoll, G [1]

Affiliations:

[1] Univ Duisburg Gesamthsch, Fac Elect Engn, Dept Comp Sci, D-47057 Duisburg, Germany

Source:

SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS | 2001

DOI:

10.1109/ICDAR.2001.953791

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper describes an off-line script recognition system that uses a language model consisting of backoff character n-grams. The performance of this open-vocabulary recognition is compared with that of closed dictionaries. The system is based on Hidden Markov Models (HMMs) and uses a hybrid modeling technique that depends on a neural vector quantizer. The presented recognition results refer to the SEDAL database of degraded English documents, such as photocopies or faxes, and to a writer-dependent handwritten database of cursive German script samples. With language models, the resulting character recognition system yields significantly better recognition results for an unlimited vocabulary.

Pages: 241-245

Page count: 3

50 entries in total

[1] Unconstrained Offline Handwriting Recognition using Connectionist Character N-grams
Zamora-Martinez, F.
Castro-Bleda, M. J.
Espana-Boquera, S.
Gorbe-Moya, J.
2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010,
[2] Comparing word, character, and phoneme n-grams for subjective utterance recognition
Wilson, Theresa
Raaijmakers, Stephan
INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1614 - +
[3] Handwritten address recognition with open vocabulary using character n-grams
Brakensiek, A
Rottland, J
Rigoll, G
EIGHTH INTERNATIONAL WORKSHOP ON FRONTIERS IN HANDWRITING RECOGNITION: PROCEEDINGS, 2002, : 357 - 362
[4] Statistical Analysis of the Indus Script Using n-Grams
Yadav, Nisha
Joglekar, Hrishikesh
Rao, Rajesh P. N.
Vahia, Mayank N.
Adhikari, Ronojoy
Mahadevan, Iravatham
PLOS ONE, 2010, 5 (03):
[5] Detection of Opinion Spam with Character n-grams
Hernandez Fusilier, Donato
Montes-y-Gomez, Manuel
Rosso, Paolo
Guzman Cabrera, Rafael
COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 285 - 294
[6] Spam detection using character N-grams
Kanaris, Ioannis
Kanaris, Konstantinos
Stamatatos, Efstathios
ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 3955 : 95 - 104
[7] Improved degraded document recognition with hybrid modeling techniques and character n-grams
Brakensiek, A
Willett, D
Rigoll, G
15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS: APPLICATIONS, ROBOTICS SYSTEMS AND ARCHITECTURES, 2000, : 438 - 441
[8] Which Granularity to Bootstrap a Multilingual Method of Document Alignment: Character N-grams or Word N-grams?
Lecluze, Charlotte
Rigouste, Lois
Giguet, Emmanuel
Lucas, Nadine
CORPUS RESOURCES FOR DESCRIPTIVE AND APPLIED STUDIES. CURRENT CHALLENGES AND FUTURE DIRECTIONS: SELECTED PAPERS FROM THE 5TH INTERNATIONAL CONFERENCE ON CORPUS LINGUISTICS (CILC2013), 2013, 95 : 473 - 481
[9] Mining generalized character n-grams in large corpora
Marques, Nuno C.
Braud, Agnès
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2003, 2902 : 419 - 423
[10] Character N-Grams for Detecting Deceptive Controversial Opinions
Sanchez-Junquera, Javier
Villasenor-Pineda, Luis
Montes-y-Gomez, Manuel
Rosso, Paolo
EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION (CLEF 2018), 2018, 11018 : 135 - 140
