A multi-modal approach to story segmentation for news video

Cited by: 30
Authors
Chaisorn, L [1]
Chua, TS [1]
Lee, CH [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore
Keywords
news story segmentation; shot classification; multi-modal approach; learning-based approach
DOI
10.1023/A:1023622605600
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This research proposes a two-level, multi-modal framework for segmenting and classifying news video into single-story semantic units. The video is analyzed at the shot level and at the story unit (or scene) level using a variety of features and techniques. At the shot level, we employ a Decision Tree technique to classify each shot into one of 13 predefined categories (or mid-level features). At the scene/story level, we perform Hidden Markov Model (HMM) analysis to locate story boundaries. Our initial results indicate that we can achieve an accuracy of over 95% for shot classification and an F1 measure of over 89% for scene/story boundary detection. Detailed analysis reveals that the HMM is effective in identifying dominant features, which helps in locating story boundaries. Our eventual goal is to support the retrieval of news video at the story unit level, together with associated texts retrieved from related news sites on the web.
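The second level of the framework, locating story boundaries from a sequence of shot categories with an HMM, can be illustrated with a minimal Viterbi decoding sketch. This is not the authors' model: the two states ("B" for the first shot of a story, "I" for a shot inside a story), the three shot categories, and every probability below are hypothetical values chosen only for illustration. The abstract does not give the actual 13 categories, the HMM topology, or the trained parameters.

```python
import math

# Hypothetical two-state model: "B" = first shot of a story, "I" = inside a story.
STATES = ("B", "I")

# Hypothetical parameters; none of these values come from the paper.
START = {"B": 0.9, "I": 0.1}
TRANS = {
    "B": {"B": 0.1, "I": 0.9},
    "I": {"B": 0.2, "I": 0.8},
}
# Three made-up shot categories stand in for the 13 predefined ones.
EMIT = {
    "B": {"anchor": 0.8, "live": 0.1, "speech": 0.1},
    "I": {"anchor": 0.1, "live": 0.5, "speech": 0.4},
}


def viterbi(observations):
    """Return the most likely state sequence for a list of shot categories."""
    # scores[s] is the best log-probability of any path ending in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in s at this step.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def story_boundaries(shot_categories):
    """Indices of shots decoded as the start of a new story."""
    return [i for i, s in enumerate(viterbi(shot_categories)) if s == "B"]


# Shot-level labels as a shot classifier might emit them (made-up sequence).
shots = ["anchor", "live", "speech", "live", "anchor", "speech", "live"]
print(story_boundaries(shots))  # [0, 4]
```

With these illustrative parameters, the two anchor shots are decoded as story starts, which mirrors the abstract's remark that the HMM picks out dominant features when locating boundaries. In a real system the parameters would be estimated from labeled news video rather than set by hand.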
Pages: 187-208
Number of pages: 22