Detection and recognition of cursive text from video frames

Cited by: 6
Authors
Mirza, Ali [1]
Zeshan, Ossama [1]
Atif, Muhammad [1]
Siddiqi, Imran [1]
Affiliations
[1] Bahria Univ, Islamabad, Pakistan
Keywords
Text detection; Text recognition; Script identification; Deep neural networks (DNNs); Convolutional neural networks (CNNs); Long short-term memory (LSTM) networks; Caption text; Cursive text; ARTIFICIAL URDU TEXT; NATURAL SCENE IMAGE; SCRIPT IDENTIFICATION; NEURAL-NETWORK; HYBRID APPROACH; LOCALIZATION; FEATURES; REPRESENTATION; SEGMENTATION; EXTRACTION;
DOI
10.1186/s13640-020-00523-5
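Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808; 0809
Abstract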
Textual content appearing in videos represents an interesting index for semantic retrieval of videos (from archives), generation of alerts (live streams), as well as high-level applications like opinion mining and content summarization. The key components of such systems are detection and recognition of textual content, which are also the subject of our study. This paper presents a comprehensive framework for detection and recognition of textual content in video frames. More specifically, we target cursive scripts, taking Urdu text as a case study. Detection of textual regions in video frames is carried out by fine-tuning deep neural network-based object detectors for the specific case of text detection. The script of the detected textual content is identified using convolutional neural networks (CNNs), while for recognition we propose UrduNet, a combination of CNNs and long short-term memory (LSTM) networks. A benchmark dataset containing cursive text, with more than 13,000 video frames, is also developed. A comprehensive series of experiments is carried out, reporting an F-measure of 88.3% for detection and a recognition rate of 87%.
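As a point of reference for the F-measure quoted in the abstract, the following is a minimal Python sketch of how an F-measure is conventionally computed from detection counts. The function name `f_measure` and the counts passed to it are hypothetical illustrations, not taken from the paper; the criterion the authors use to decide whether a detected region counts as correct is not given in this record.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall (F1); returns 0.0 when undefined."""
    detected = true_positives + false_positives   # regions the detector reported
    actual = true_positives + false_negatives     # regions present in the ground truth
    if detected == 0 or actual == 0:
        return 0.0
    precision = true_positives / detected
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not the paper's data).
score = f_measure(true_positives=8, false_positives=2, false_negatives=2)
print(f"F-measure: {score:.3f}")
```

Because the harmonic mean is pulled toward the lower of the two values, a high F-measure requires both precision and recall to be high.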
Pages: 19
Related papers
  • [1] Detection and recognition of cursive text from video frames
    Ali Mirza
    Ossama Zeshan
    Muhammad Atif
    Imran Siddiqi
    [J]. EURASIP Journal on Image and Video Processing, 2020
  • [2] Text detection and recognition in images and video frames
    Chen, DT
    Odobez, JM
    Bourlard, H
    [J]. PATTERN RECOGNITION, 2004, 37 (03) : 595 - 608
  • [3] Detection and Recognition of Arabic Text in Video Frames
    Ohyama, Wataru
    Iwata, Seiya
    Wakabayashi, Tetsushi
    Kimura, Fumitaka
    [J]. 2017 14TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR 2017), VOL 7, 2017, : 20 - 24
  • [4] Video Scene Text Frames Categorization for Text Detection and Recognition
    Qin, Longfei
    Shivakumara, Palaiahnakote
    Lu, Tong
    Pal, Umapada
    Tan, Chew Lim
    [J]. 2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 3886 - 3891
  • [5] Optimal Classification Model for Text Detection and Recognition in Video Frames
    Eshwarappa, Laxmikant
    Rajput, G. G.
    [J]. INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2023,
  • [6] Text area detection from video frames
    Chen, XR
    Zhang, HJ
    [J]. ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2001, PROCEEDINGS, 2001, 2195 : 222 - 228
  • [7] Recognition of cursive video text using a deep learning framework
    Mirza, Ali
    Siddiqi, Imran
    [J]. IET IMAGE PROCESSING, 2020, 14 (14) : 3444 - 3455
  • [8] Impact of Pre-Processing on Recognition of Cursive Video Text
    Mirza, Ali
    Siddiqi, Imran
    Mustufa, Syed Ghulam
    Hussain, Mazahir
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, PT I, 2020, 11867 : 565 - 576
  • [9] Fractional Poisson enhancement model for text detection and recognition in video frames
    Roy, Sangheeta
    Shivakumara, Palaiahnakote
    Jalab, Hamid A.
    Ibrahim, Rabha W.
    Pal, Umapada
    Lu, Tong
    [J]. PATTERN RECOGNITION, 2016, 52 : 433 - 447
  • [10] Multi-Lingual Text Recognition from Video Frames
    Sharma, Nabin
    Mandal, Ranju
    Sharma, Rabi
    Roy, Partha P.
    Pal, Umapada
    Blumenstein, Michael
    [J]. 2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 951 - 955