Integrating audio-visual features and text information for story segmentation of news video

Cited by: 0
Authors
Liu, Hua-Yong [1]
Zhou, Dong-Ru [1]
Affiliations
[1] Sch. of Comp., Wuhan Univ., Wuhan 430072, China
School of Computer, Wuhan University, Wuhan 430072, Hubei, China
Abstract
Video data are composed of multimodal information streams, including visual, auditory, and textual streams, so this paper describes an approach to story segmentation for news video based on multimodal analysis. The proposed approach detects topic-caption frames and integrates them with silence-clip detection results and shot segmentation results to locate news story boundaries. Integrating audio-visual features with text information overcomes the weakness of approaches that use only image analysis techniques. On test data of 135,400 frames, detection of the boundaries between news stories achieves an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results show that the approach is valid and robust.
Keywords
news video; story segmentation; audio-visual features analysis; text detection
CLC number: TP 311.5
Received date: 2002-12-23
Foundation item: Supported by the National Natural Science Foundation of China (60173045)
Biography: Liu Hua-yong (1978-), male, Ph.D. candidate; research direction: video retrieval and speech signal processing. E-mail: hyhut9_en@sina.com
To whom correspondence should be addressed.
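The abstract says that topic-caption frames, silence clips, and shot boundaries are integrated to locate story boundaries, but it does not give the fusion rule. The sketch below is one plausible reading, not the authors' method: a shot boundary is kept as a story boundary when a silence clip lies near it and a topic-caption frame follows shortly after. The function names, the frame tolerances (`silence_tol`, `caption_window`), and the reading of "accuracy rate" as precision are all assumptions made for illustration.

```python
from bisect import bisect_left


def locate_story_boundaries(shot_boundaries, caption_frames, silence_clips,
                            silence_tol=25, caption_window=250):
    """Hypothetical fusion rule (an assumption, not taken from the paper).

    shot_boundaries: frame indices where a shot change was detected
    caption_frames:  frame indices where a topic caption first appears
    silence_clips:   (start_frame, end_frame) pairs of detected silence
    """
    captions = sorted(caption_frames)
    boundaries = []
    for b in sorted(shot_boundaries):
        # Audio cue: some silence clip overlaps the neighbourhood of the cut.
        near_silence = any(start - silence_tol <= b <= end + silence_tol
                           for start, end in silence_clips)
        # Text cue: the first caption at or after the cut is close enough.
        i = bisect_left(captions, b)
        caption_follows = (i < len(captions)
                           and captions[i] - b <= caption_window)
        if near_silence and caption_follows:
            boundaries.append(b)
    return boundaries


def precision_recall(detected, truth, tol=25):
    """Score detected boundaries against ground truth within `tol` frames.

    Treats the abstract's "accuracy rate" as precision (an assumption).
    """
    correct = sum(any(abs(d - t) <= tol for t in truth) for d in detected)
    found = sum(any(abs(d - t) <= tol for d in detected) for t in truth)
    precision = correct / len(detected) if detected else 0.0
    recall = found / len(truth) if truth else 0.0
    return precision, recall
```

Requiring both cues at a shot boundary favours precision; accepting either cue alone would raise recall at the cost of more false boundaries. Which trade-off the paper makes cannot be determined from the abstract.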
DOI
10.1007/bf02903674
Pages: 1070 - 1074
Related papers
50 in total
  • [11] Automated generation of news content hierarchy by integrating audio, video, and text information
    Huang, Q
    Liu, Z
    Rosenberg, A
    Gibbon, D
    Shahraray, B
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999 : 3025 - 3028
  • [12] Audio-visual speaker recognition for video broadcast news
    Maison, B
    Neti, C
    Senior, A
    JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2001, 29 (1-2) : 71 - 79
  • [13] Audio-Visual Speaker Recognition for Video Broadcast News
    Benoît Maison
    Chalapathy Neti
    Andrew Senior
    Journal of VLSI signal processing systems for signal, image and video technology, 2001, 29 : 71 - 79
  • [14] Extracting semantic information from basketball video based on audio-visual features
    Kim, K
    Choi, J
    Kim, N
    Kim, P
    IMAGE AND VIDEO RETRIEVAL, 2002, 2383 : 278 - 288
  • [15] Audio-Visual Segmentation
    Zhou, Jinxing
    Wang, Jianyuan
    Zhang, Jiayi
    Sun, Weixuan
    Zhang, Jing
    Birchfield, Stan
    Guo, Dan
    Kong, Lingpeng
    Wang, Meng
    Zhong, Yiran
    COMPUTER VISION, ECCV 2022, PT XXXVII, 2022, 13697 : 386 - 403
  • [16] VIDEO CAMERA IDENTIFICATION USING AUDIO-VISUAL FEATURES
    Milani, S.
    Cuccovillo, L.
    Tagliasacchi, M.
    Tubaro, S.
    Aichroth, P.
    2014 5TH EUROPEAN WORKSHOP ON VISUAL INFORMATION PROCESSING (EUVIP 2014), 2014
  • [17] BAVS: Bootstrapping Audio-Visual Segmentation by Integrating Foundation Knowledge
    Liu, Chen
    Li, Peike
    Zhang, Hu
    Li, Lincheng
    Huang, Zi
    Wang, Dadong
    Yu, Xin
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 10015 - 10028
  • [18] Identification of story units in audio-visual sequences by joint audio and video processing
    Saraceno, C
    Leonardi, R
    1998 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING - PROCEEDINGS, VOL 1, 1998 : 363 - 367
  • [19] Audio-Visual Segmentation with Semantics
    Zhou, Jinxing
    Shen, Xuyang
    Wang, Jianyuan
    Zhang, Jiayi
    Sun, Weixuan
    Zhang, Jing
    Birchfield, Stan
    Guo, Dan
    Kong, Lingpeng
    Wang, Meng
    Zhong, Yiran
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024 : 1644 - 1664
  • [20] Integration of audio and video semantic features for news video scene segmentation
    Xu, J
    Liu, HB
    Zhou, DR
    VISUALIZATION AND OPTIMIZATION TECHNIQUES, 2001, 4553 : 227 - 232