A spatial-temporal approach for video caption detection and recognition

Cited by: 86

Authors:

Tang, X [1]

Gao, XB

Liu, JZ

Zhang, HJ

Affiliations:

[1] Chinese Univ Hong Kong, Dept Informat Engn, Shatin, Hong Kong, Peoples R China

[2] Microsoft Res Asia, Beijing 100080, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS | 2002 / Vol. 13 / Issue 04

Keywords:

Chinese caption detection; fuzzy clustering neural networks (FCNNs); video indexing; video OCR; video shot segmentation;

DOI:

10.1109/TNN.2002.1021896

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme, we locate both spatial and temporal positions of video captions with high precision and efficiency. Then, employing several new character segmentation and binarization techniques, we improve the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions. As the first attempt at Chinese video-caption recognition, our experimental results are very encouraging.

Citation

Pages: 961-971

Page count: 11

50 items in total

[1] Exploiting Spatial-temporal Correlations for Video Anomaly Detection
Zhao, Mengyang
Liu, Yang
Liu, Jing
Zeng, Xinhua
[J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1727 - 1733
[2] Video Object Detection with an Aligned Spatial-Temporal Memory
Xiao, Fanyi
Lee, Yong Jae
[J]. COMPUTER VISION - ECCV 2018, PT VIII, 2018, 11212 : 494 - 510
[3] Spatial-temporal Activity Interactions Detection in Video Survalliance
Fan, Yawen
Zheng, Shibao
[J]. 2013 2ND INTERNATIONAL SYMPOSIUM ON INSTRUMENTATION AND MEASUREMENT, SENSOR NETWORK AND AUTOMATION (IMSNA), 2013, : 432 - 435
[4] Model-based approach to spatial-temporal sampling of video clips for video object detection by classification
Chuang, Chi-Han
Cheng, Shyi-Chyi
Chang, Chin-Chun
Chen, Yi-Ping Phoebe
[J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2014, 25 (05) : 1018 - 1030
[5] Caption Detection and Text Recognition in News Video
Yang, Zhe
Shi, Ping
[J]. 2012 5TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING (CISP), 2012, : 188 - 191
[6] Slow Video Detection Based on Spatial-Temporal Feature Representation
Ma, Jianyu
Yao, Haichao
Ni, Rongrong
Zhao, Yao
[J]. PATTERN RECOGNITION AND COMPUTER VISION, PT III, 2021, 13021 : 298 - 309
[7] ISTVT: Interpretable Spatial-Temporal Video Transformer for Deepfake Detection
Zhao, Cairong
Wang, Chutian
Hu, Guosheng
Chen, Haonan
Liu, Chun
Tang, Jinhui
[J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 1335 - 1348
[8] Spatial-Temporal Structural and Dynamics Features for Video Fire Detection
Wang, Hongcheng
Finn, Alan
Erdinc, Ozgur
Vincitore, Antonio
[J]. 2013 IEEE WORKSHOP ON APPLICATIONS OF COMPUTER VISION (WACV), 2013, : 513 - 519
[9] An Efficient Spatial-Temporal Polyp Detection Framework for Colonoscopy Video
Zhang, Pengfei
Sun, Xinzi
Wang, Dechun
Wang, Xizhe
Cao, Yu
Liu, Benyuan
[J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 1252 - 1259
[10] Spatial-temporal graph attention network for video anomaly detection
Chen, Haoyang
Mei, Xue
Ma, Zhiyuan
Wu, Xinhong
Wei, Yachuan
[J]. IMAGE AND VISION COMPUTING, 2023, 131
