Towards Coherent Natural Language Description of Video Streams

Cited by: 0

Authors:

Khan, Muhammad Usman Ghani [1]

Zhang, Lei [2]

Gotoh, Yoshihiko [1]

Affiliations:

[1] University of Sheffield, Sheffield, South Yorkshire, England

[2] Harbin Engineering University, Harbin, People's Republic of China

Source:

2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS) | 2011

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This contribution addresses the generation of smooth and coherent descriptions of video streams. First, conventional image processing techniques are applied to extract high-level features from individual video frames. A natural language description of the frame contents is then produced from these high-level features. To extend the approach to video streams, we introduce units of features and outline how these units can be used to present coherent, smooth, and well-phrased descriptions by incorporating spatial and temporal information. The approach is evaluated by calculating an overlap similarity score between human-authored and machine-generated descriptions.

Pages: 8