A multi-modal approach to story segmentation for news video

Cited by: 30
Authors
Chaisorn, L [1]
Chua, TS [1]
Lee, CH [1]
Affiliation
[1] Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore
Keywords
news story segmentation; shot classification; multi-modal approach; learning-based approach
DOI
10.1023/A:1023622605600
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This research proposes a two-level, multi-modal framework for segmenting and classifying news video into single-story semantic units. The video is analyzed at the shot level and at the story-unit (or scene) level using a variety of features and techniques. At the shot level, we employ a decision-tree technique to classify each shot into one of 13 predefined categories (mid-level features). At the scene/story level, we perform Hidden Markov Model (HMM) analysis to locate story boundaries. Our initial results indicate an accuracy of over 95% for shot classification and an F1 measure of over 89% for scene/story boundary detection. Detailed analysis reveals that the HMM is effective in identifying dominant features, which helps in locating story boundaries. Our eventual goal is to support the retrieval of news video at the story-unit level, together with associated texts retrieved from related news sites on the web.
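The second level of the framework (HMM analysis over classified shots, scored with an F1 measure) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two hidden states, the three shot categories (the paper uses 13), every probability value, and the exact-match scoring rule. The shot categories are assumed to be already available, so the decision-tree stage is not shown.

```python
import math

# Hypothetical two-state model: a shot either starts a story or continues one.
STATES = ("start", "inside")

# All probabilities below are invented for illustration only.
START_P = {"start": 0.9, "inside": 0.1}
TRANS_P = {
    "start": {"start": 0.1, "inside": 0.9},
    "inside": {"start": 0.3, "inside": 0.7},
}
# Emissions over three made-up shot categories.
EMIT_P = {
    "start": {"anchor": 0.8, "live": 0.1, "graphic": 0.1},
    "inside": {"anchor": 0.1, "live": 0.7, "graphic": 0.2},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for a list of shot categories."""
    # Each column maps state -> (best log-probability, predecessor state).
    first = observations[0]
    columns = [{
        s: (math.log(START_P[s]) + math.log(EMIT_P[s][first]), None)
        for s in STATES
    }]
    for obs in observations[1:]:
        column = {}
        for s in STATES:
            score, prev = max(
                (columns[-1][p][0] + math.log(TRANS_P[p][s]), p) for p in STATES
            )
            column[s] = (score + math.log(EMIT_P[s][obs]), prev)
        columns.append(column)
    # Backtrack from the best final state.
    state = max(STATES, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


def story_boundaries(path):
    """Indices of shots decoded as the first shot of a story."""
    return [i for i, s in enumerate(path) if s == "start"]


def f1_score(predicted, reference):
    """F1 over boundary positions, counting only exact index matches."""
    pred, ref = set(predicted), set(reference)
    hits = len(pred & ref)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    shots = ["anchor", "live", "live", "anchor", "live", "graphic"]
    decoded = viterbi(shots)
    print(decoded)
    print(story_boundaries(decoded))
```

Log-probabilities are used so that long shot sequences do not underflow. A real evaluation would normally allow a tolerance window around each reference boundary instead of requiring exact matches.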
Pages: 187-208
Page count: 22