Improving multimedia retrieval with a video OCR

Authors
Das, Dipanjan [1]
Chen, Datong [2]
Hauptmann, Alexander G. [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Comp Sci Dept, Pittsburgh, PA 15213 USA
Keywords
video OCR; OCR; multimedia retrieval; video retrieval; optical character recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a set of experiments with a video OCR system (VOCR) tailored for video information retrieval and establish its importance in multimedia search in general and for some specific queries in particular. The system, inspired by existing work on text detection and recognition in images, uses detailed analysis of individual video frames to produce candidate text regions. The text regions are then binarized and sent to a commercial OCR engine, yielding ASCII text that is finally used to create search indexes. The system is evaluated using the TRECVID data. We compare the system's performance from an information retrieval perspective with another VOCR system, developed using multi-frame integration, and empirically demonstrate that deep analysis of individual video frames results in better video retrieval. We also evaluate the effect of various textual sources on multimedia retrieval by combining the VOCR outputs with automatic speech recognition (ASR) transcripts. For general search queries, the VOCR system coupled with ASR sources outperforms the other system by a large margin. For search queries that involve named entities, especially people's names, the VOCR system even outperforms speech transcripts, demonstrating that source selection for particular query types is essential.
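The final indexing step and the idea of source selection can be illustrated with a small sketch. This is not the authors' system: the shot records, the source names `vocr` and `asr`, and the functions `tokenize`, `build_index`, and `search` are all hypothetical, and a plain boolean inverted index stands in for whatever retrieval engine the paper actually used. It only shows how choosing which textual sources feed the index changes which shots a named-entity query can reach.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(shots, sources):
    """Map each token to the set of shot IDs whose selected sources contain it."""
    index = defaultdict(set)
    for shot_id, texts in shots.items():
        for source in sources:
            for token in tokenize(texts.get(source, "")):
                index[token].add(shot_id)
    return index


def search(index, query):
    """Return the shot IDs that contain every query token (boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical per-shot text: "vocr" is on-screen text, "asr" is the speech transcript.
shots = {
    "shot_001": {"vocr": "Senator John Smith", "asr": "the senator spoke about the budget today"},
    "shot_002": {"vocr": "", "asr": "john smith responded to questions"},
    "shot_003": {"vocr": "Weather Forecast", "asr": "rain is expected tomorrow"},
}

vocr_index = build_index(shots, ["vocr"])
asr_index = build_index(shots, ["asr"])
combined_index = build_index(shots, ["vocr", "asr"])

print(sorted(search(vocr_index, "John Smith")))
print(sorted(search(asr_index, "John Smith")))
print(sorted(search(combined_index, "John Smith")))
```

In this toy data, the person's name appears on screen in one shot and is spoken in another, so each single-source index finds only one of them while the combined index finds both.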
Pages: 12