A spatial-temporal approach for video caption detection and recognition

Cited by: 86
Authors
Tang, X [1]
Gao, XB
Liu, JZ
Zhang, HJ
Affiliations
[1] Chinese Univ Hong Kong, Dept Informat Engn, Shatin, Hong Kong, Peoples R China
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2002 / Vol. 13 / No. 04
Keywords
Chinese caption detection; fuzzy clustering neural networks (FCNNs); video indexing; video OCR; video shot segmentation
DOI
10.1109/TNN.2002.1021896
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme, we locate both the spatial and temporal positions of video captions with high precision and efficiency. Then, employing several new character segmentation and binarization techniques, we improve the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions. As the first attempt at Chinese video-caption recognition, our experimental results are very encouraging.
Pages: 961 - 971
Page count: 11