SPEECH SHOT EXTRACTION FROM BROADCAST NEWS VIDEOS

Cited by: 4
Authors
Kumagai, Shogo [1, 5]
Doman, Keisuke [1, 4]
Takahashi, Tomokazu [2 ]
Deguchi, Daisuke [3 ]
Ide, Ichiro [1 ]
Murase, Hiroshi [1 ]
Affiliations
[1] Nagoya Univ, Grad Sch Informat Sci, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648601, Japan
[2] Gifu Shotoku Gakuen Univ, Fac Econ & Informat, Gifu 5008288, Japan
[3] Nagoya Univ, Informat & Commun Headquarters, Chikusa Ku, Nagoya, Aichi 4648601, Japan
[4] Japan Soc Promot Sci, Tokyo, Japan
[5] Ricoh Co Ltd, Tokyo, Japan
Keywords
Speech shot extraction; audio-visual integration; broadcast news videos
DOI
10.1142/S1793351X12400077
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a method for discriminating between speech shots and narrated shots in order to extract genuine speech shots from broadcast news video. Speech shots in news videos contain a wealth of multimedia information about the speaker and are therefore valuable as archived material. One existing approach extracts speech shots using the position and size of the face region. However, this approach alone is insufficient, because news videos also contain narrated shots: non-speech shots in which the speaker is not the subject who appears on the screen. To solve this problem, we propose a two-stage method for discriminating between speech shots and narrated shots. The first stage directly evaluates the inconsistency between the subject and the speaker based on the co-occurrence between lip motion and voice. The second stage evaluates intra- and inter-shot features that capture the tendencies of speech shots. By combining both stages, the proposed method accurately discriminates between speech shots and narrated shots. In the experiments, the overall accuracy of speech shot extraction by the proposed method was 0.871, which confirms its effectiveness.
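The two-stage idea in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the function names, the use of Pearson correlation as the co-occurrence measure, the single precomputed `shot_feature_score`, both thresholds, and the cascade used to combine the stages. The abstract does not state how the authors compute or combine these quantities.

```python
import math


def cooccurrence_score(lip_motion, voice_energy):
    """Pearson correlation between per-frame lip motion and voice energy.

    Assumed stand-in for the co-occurrence measure; the abstract does not
    specify the measure actually used.
    """
    n = len(lip_motion)
    if n != len(voice_energy) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 frames")
    mean_l = sum(lip_motion) / n
    mean_v = sum(voice_energy) / n
    cov = sum((l - mean_l) * (v - mean_v) for l, v in zip(lip_motion, voice_energy))
    var_l = sum((l - mean_l) ** 2 for l in lip_motion)
    var_v = sum((v - mean_v) ** 2 for v in voice_energy)
    if var_l == 0 or var_v == 0:
        # A constant signal carries no co-occurrence evidence.
        return 0.0
    return cov / math.sqrt(var_l * var_v)


def is_speech_shot(lip_motion, voice_energy, shot_feature_score,
                   corr_threshold=0.5, feature_threshold=0.5):
    """Hypothetical two-stage decision (thresholds and cascade are assumptions)."""
    # Stage 1: reject shots whose lip motion does not co-occur with the voice.
    if cooccurrence_score(lip_motion, voice_energy) < corr_threshold:
        return False
    # Stage 2: accept only if the intra-/inter-shot feature score
    # (assumed to be precomputed elsewhere) is high enough.
    return shot_feature_score >= feature_threshold


if __name__ == "__main__":
    lip = [0.0, 1.0, 0.0, 1.0]
    synced_voice = [0.1, 0.9, 0.2, 0.8]
    print(is_speech_shot(lip, synced_voice, shot_feature_score=0.8))
```

In this sketch a shot whose voice track does not vary with the lip motion is rejected in the first stage, which mirrors the narrated-shot case where the on-screen subject is not the speaker.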
Pages: 179-204
Number of pages: 26