Towards Coherent Natural Language Description of Video Streams

Cited by: 0
Authors
Khan, Muhammad Usman Ghani [1]
Zhang, Lei [2]
Gotoh, Yoshihiko [1]
Affiliations
[1] University of Sheffield, Sheffield, South Yorkshire, England
[2] Harbin Engineering University, Harbin, People's Republic of China
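Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract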
This contribution addresses an approach to creating smooth and coherent descriptions of video streams. First, conventional image processing techniques are applied to extract high-level features from individual video frames. A natural language description of the frame contents is then produced from these high-level features. To extend the approach to the description of video streams, we introduce units of features and give an overview of how these units can be used to present coherent, smooth and well-phrased descriptions by incorporating spatial and temporal information. The approach is evaluated by calculating an overlap similarity score between human-authored and machine-generated descriptions.
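The abstract does not say which overlap measure is used, so the following is only a minimal sketch of one common choice, the overlap coefficient computed over word sets. The function names, the tokenization rule and the two example sentences are assumptions made for illustration and are not taken from the paper.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap_similarity(reference, candidate):
    """Overlap coefficient |A & B| / min(|A|, |B|) over word sets.

    Assumption: this is one possible overlap score, not necessarily
    the measure used by the authors.
    """
    a, b = tokenize(reference), tokenize(candidate)
    if not a or not b:
        return 0.0  # no words to compare
    return len(a & b) / min(len(a), len(b))


# Hypothetical human-authored and machine-generated descriptions.
human = "A man is walking across the room"
machine = "A man walks across a room"
print(overlap_similarity(human, machine))
```

Because the score works on exact word matches, "walking" and "walks" count as different words here; stemming the tokens first would be one way to make the comparison less strict.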
Pages: 8
Related papers
  • [1] GENERATING COHERENT NATURAL LANGUAGE ANNOTATIONS FOR VIDEO STREAMS
    Khan, Muhammad Usman Ghani
    Zhang, Lei
    Gotoh, Yoshihiko
    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 2893 - 2896
  • [2] Natural Language Description of Video Streams Using Task-Specific Feature Encoding
    Dilawari, Aniqa
    Khan, Muhammad Usman Ghani
    Farooq, Ammarah
    Zahoor-Ur-Rehman
    Rho, Seungmin
    Mehmood, Irfan
    IEEE ACCESS, 2018, 6 : 16639 - 16645
  • [3] A framework for creating natural language descriptions of video streams
    Khan, Muhammad Usman Ghani
    Al Harbi, Nouf
    Gotoh, Yoshihiko
    INFORMATION SCIENCES, 2015, 303 : 61 - 82
  • [4] The Role of the Input in Natural Language Video Description
    Cascianelli, Silvia
    Costante, Gabriele
    Devo, Alessandro
    Ciarfuglia, Thomas A.
    Valigi, Paolo
    Fravolini, Mario L.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (01) : 271 - 283
  • [5] Video Scene Classification based on Natural Language Description
    Zhang, Lei
    Khan, Muhammad Usman Ghani
    Gotoh, Yoshihiko
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011,
  • [6] TOWARDS A DESCRIPTION OF ROBOT MOVEMENTS BY A QUASI-NATURAL LANGUAGE
    ADORNI, G
    GAGLIO, S
    ZACCARIA, R
    THEORETICAL LINGUISTICS, 1984, 11 (1-2) : 61 - 85
  • [7] Generating natural language description of human behavior from video images
    Kojima, A
    Izumi, M
    Tamura, T
    Fukunaga, K
    15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS: APPLICATIONS, ROBOTICS SYSTEMS AND ARCHITECTURES, 2000, : 728 - 731
  • [8] Towards a description for video indexation
    Lebourgeois, F
    Jolion, JM
    Awart, PC
    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2, 1998, : 912 - 915
  • [9] Full-GRU Natural Language Video Description for Service Robotics Applications
    Cascianelli, Silvia
    Costante, Gabriele
    Ciarfuglia, Thomas A.
    Valigi, Paolo
    Fravolini, Mario L.
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2018, 3 (02) : 841 - 848
  • [10] Be in all its states Towards a conceptual description of the verb be in the definition of natural language
    Sambre, Paul
    COGNITEXTES, 2007, 1