Towards Coherent Natural Language Description of Video Streams

Cited by: 0
Authors
Khan, Muhammad Usman Ghani [1]
Zhang, Lei [2]
Gotoh, Yoshihiko [1]
Affiliations
[1] Univ Sheffield, Sheffield, S Yorkshire, England
[2] Harbin Engn Univ, Harbin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This contribution presents an approach to creating smooth and coherent descriptions of video streams. First, conventional image processing techniques are applied to extract high-level features from individual video frames. A natural language description of the frame contents is then produced from these high-level features. To extend the approach to the description of video streams, we introduce units of features and give an overview of how these units can be used to produce coherent, smooth, and well-phrased descriptions by incorporating spatial and temporal information. The approach is evaluated by calculating an overlap similarity score between human-authored and machine-generated descriptions.
Pages: 8