Audio Content Analysis for Understanding Structures of Scene in Video

Authors
Kang, Chan-Mi [1]
Baek, Joong-Hwan [2]
Affiliations
[1] Hankuk Aviat Univ, Multimedia Retrieval Lab, Sch Elect & Commun Engn, Seoul, South Korea
[2] Hankuk Aviat Univ, Sch Elect & Commun Engn, Seoul, South Korea
DOI
10.1007/11816157_151
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a system that categorizes audio into seven classes. As classification features, we use the mean and variance of the RMS, the zero-crossing rate (ZCR), the fundamental frequency, and the frequency peak, each extracted from every 25 ms frame. In addition to audio content classification, we perform speaker identification on voice sequences that are extracted automatically by our proposed method. The proposed scheme reaches an accuracy of 93.8% in categorizing the audio signal and 80% in speaker identification.
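
The frame-level features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `frame_signal` and `clip_features` are hypothetical, the frames are assumed to be non-overlapping (the abstract gives only the 25 ms length), the frequency peak is assumed to be the strongest non-DC FFT bin, and the fundamental frequency is left out because the abstract does not say how it is estimated.

```python
import numpy as np


def frame_signal(x, sample_rate, frame_ms=25.0):
    """Split a 1-D signal into non-overlapping frames (assumed hop = frame length)."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    n_frames = len(x) // frame_len
    # Drop the trailing samples that do not fill a whole frame.
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)


def clip_features(x, sample_rate, frame_ms=25.0):
    """Return [mean RMS, mean ZCR, mean peak, var RMS, var ZCR, var peak]."""
    frames = frame_signal(np.asarray(x, dtype=float), sample_rate, frame_ms)

    # Root-mean-square energy of each frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))

    # Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Frequency peak: frequency of the largest magnitude bin, skipping DC.
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    peak = freqs[np.argmax(spectrum[:, 1:], axis=1) + 1]

    # Summarize the per-frame values by their mean and variance over the clip.
    per_frame = np.stack([rms, zcr, peak], axis=1)
    return np.concatenate([per_frame.mean(axis=0), per_frame.var(axis=0)])
```

The resulting six-element vector would then be passed to a classifier; the abstract does not name the classifier, so none is shown here.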
Pages: 1213-1218
Number of pages: 6