Large-Lexicon Attribute-Consistent Text Recognition in Natural Images

Cited by: 59

Authors:

Novikova, Tatiana [1]

Barinova, Olga [1]

Kohli, Pushmeet [2]

Lempitsky, Victor [3]

Affiliations:

[1] M. V. Lomonosov Moscow State University, Moscow 117234, Russia

[2] Microsoft Research Cambridge, Cambridge, England

[3] Yandex, Moscow, Russia

Source:

Computer Vision - ECCV 2012, Part VI | 2012 / Vol. 7577

DOI:

10.1007/978-3-642-33783-3_54

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a new model for the task of word recognition in natural images that simultaneously models visual and lexicon consistency of words in a single probabilistic model. Our approach combines local likelihood and pairwise positional consistency priors with higher order priors that enforce consistency of characters (lexicon) and their attributes (font and colour). Unlike traditional stage-based methods, word recognition in our framework is performed by estimating the maximum a posteriori (MAP) solution under the joint posterior distribution of the model. MAP inference in our model is performed through the use of weighted finite-state transducers (WFSTs). We show how the efficiency of certain operations on WFSTs can be utilized to find the most likely word under the model in an efficient manner. We evaluate our method on a range of challenging datasets (ICDAR'03, SVT, ICDAR'11). Experimental results demonstrate that our method outperforms state-of-the-art methods for cropped word recognition.
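
The lexicon-constrained MAP step mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's model or its WFST machinery: it assumes a fixed character segmentation, uses only per-position unary costs (negative log-likelihoods), and omits the pairwise positional and attribute (font, colour) priors. The names `build_trie`, `map_word`, and `END` are hypothetical. The lexicon is stored as a trie, which plays the role of a finite-state acceptor, and the search keeps the minimum-cost path that ends on a complete lexicon word.

```python
END = ""  # sentinel key marking "a lexicon word ends here" (hypothetical convention)


def build_trie(lexicon):
    """Store the lexicon as a nested-dict trie (a simple finite-state acceptor)."""
    root = {}
    for word in lexicon:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def map_word(char_costs, lexicon):
    """Return (cost, word) for the cheapest lexicon word, or None if none fits.

    char_costs: one dict per character position, mapping a candidate
    character to its negative log-likelihood at that position.
    """
    # Each frontier entry is a partial path: (accumulated cost, prefix, trie node).
    frontier = [(0.0, "", build_trie(lexicon))]
    for costs in char_costs:
        next_frontier = []
        for cost, prefix, node in frontier:
            for ch, child in node.items():
                # Follow only transitions allowed by both the lexicon and the image evidence.
                if ch != END and ch in costs:
                    next_frontier.append((cost + costs[ch], prefix + ch, child))
        frontier = next_frontier
    # Keep only paths that stop on a complete lexicon word.
    finals = [(cost, prefix) for cost, prefix, node in frontier if END in node]
    return min(finals) if finals else None


if __name__ == "__main__":
    costs = [
        {"c": 0.2, "e": 0.1},
        {"a": 0.3, "o": 0.2},
        {"t": 0.1, "r": 0.4},
    ]
    print(map_word(costs, ["cat", "car", "ear", "dog"]))
```

In this example the unconstrained best string would be "eot", but it is not in the lexicon, so the search returns "cat" as the cheapest admissible word. A WFST toolkit would express the same idea as a composition of a character-hypothesis transducer with a lexicon acceptor followed by a shortest-path query.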

Pages: 752-765

Page count: 14