
Cited by: 3

Authors:

Song, Yu [1]

Wang, Wenhong [1]

Guo, Fengjuan [2]

Affiliations:

[1] North China Elect Power Univ, Dept Comp, Baoding 071003, Peoples R China

[2] North China Elect Power Univ, Sci & Technol Coll, Baoding 071003, Peoples R China

Source:

ICCSSE 2009: PROCEEDINGS OF 2009 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION | 2009

Keywords:

news video; story segmentation; silence clip; shot detection; audio-visual fusion; VIDEO;

DOI:

10.1109/ICCSE.2009.5228544

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

This paper presents a method for news video story segmentation that fuses multiple audio and visual features. First, anchorperson shots are detected to determine where each news story begins, and topic captions are then detected between anchorperson shots. Next, silence clips in the news video are detected using short-time energy and short-time average zero-crossing rate, and the voice features of the anchorperson are analyzed. Finally, the method fuses these features (anchorperson shots, topic captions, silence clips, and voice features) to segment news stories. Experimental results show that the approach is valid and avoids the deficiencies of detecting news stories from a single feature.
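The silence-detection step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the non-overlapping framing, the threshold values, and the minimum run length are all assumptions chosen for illustration. It computes short-time energy and zero-crossing rate per frame and reports runs of frames where both fall below their thresholds.

```python
import math

def frame_features(samples, frame_len):
    """Return (short-time energy, zero-crossing rate) per non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features

def silence_clips(samples, frame_len=160, energy_thresh=1e-4,
                  zcr_thresh=0.1, min_frames=3):
    """Return (start_frame, end_frame_exclusive) runs of silent frames.

    All default values are hypothetical and would need tuning on real audio.
    """
    feats = frame_features(samples, frame_len)
    clips = []
    run_start = None
    for i, (energy, zcr) in enumerate(feats):
        silent = energy < energy_thresh and zcr < zcr_thresh
        if silent and run_start is None:
            run_start = i
        elif not silent and run_start is not None:
            if i - run_start >= min_frames:
                clips.append((run_start, i))
            run_start = None
    # Close a silent run that extends to the end of the signal.
    if run_start is not None and len(feats) - run_start >= min_frames:
        clips.append((run_start, len(feats)))
    return clips

if __name__ == "__main__":
    # Synthetic signal: 440 Hz tone, a stretch of zeros, then the tone again.
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
    print(silence_clips(tone + [0.0] * 640 + tone))
```

In a fusion scheme like the one the abstract describes, the frame indices of such silence clips would be converted to timestamps and compared against anchorperson-shot and caption boundaries; how the paper weights or combines these cues is not stated in this record.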


Pages: 1065 / +

Page count: 2

