Audio Feature Extraction and Analysis for Scene Segmentation and Classification

Cited by: 0
Authors
Zhu Liu
Yao Wang
Tsuhan Chen
Affiliations
[1] Polytechnic University
[2] Carnegie Mellon University
Keywords
Audio Signal; Audio Feature; Scene Change; Football Game; Audio Clip
DOI
Not available
Abstract
Understanding the scene content of a video sequence is very important for content-based indexing and retrieval of multimedia databases. Research in this area in the past several years has focused on the use of speech recognition and image analysis techniques. As a complementary effort to the prior work, we have focused on using the associated audio information (mainly the nonspeech portion) for video scene analysis. As an example, we consider the problem of discriminating five types of TV programs, namely commercials, basketball games, football games, news reports, and weather forecasts. A set of low-level audio features is proposed for characterizing the semantic content of short audio clips. The linear separability of different classes under the proposed feature space is examined using a clustering analysis. The effective features are identified by evaluating the intracluster and intercluster scattering matrices of the feature space. Using these features, a neural net classifier was successful in separating the above five types of TV programs. By evaluating the changes between the feature vectors of adjacent clips, we can also identify scene breaks in an audio sequence quite accurately. These results demonstrate the capability of the proposed audio features for characterizing the semantic content of an audio sequence.
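The scene-break idea mentioned in the abstract (comparing the feature vectors of adjacent clips) can be illustrated with a minimal Python sketch. The abstract does not specify the features, the change measure, or the decision rule, so everything here is an assumption for illustration: the Euclidean distance, the fixed `threshold`, the function names, and the toy 2-D feature values are all hypothetical and are not taken from the paper.

```python
import math


def adjacent_distances(features):
    """Euclidean distance between each pair of adjacent clip feature vectors.

    Assumption: Euclidean distance stands in for the paper's (unspecified)
    measure of change between adjacent clips.
    """
    return [math.dist(a, b) for a, b in zip(features, features[1:])]


def find_scene_breaks(features, threshold):
    """Return each index i where a break is flagged between clip i-1 and clip i.

    Assumption: a fixed threshold on the distance; the actual decision rule
    used by the authors is not given in the abstract.
    """
    dists = adjacent_distances(features)
    return [i + 1 for i, d in enumerate(dists) if d > threshold]


# Hypothetical 2-D feature vectors for six consecutive clips:
# the first three resemble each other, as do the last three.
clips = [
    (0.10, 0.20), (0.12, 0.18), (0.11, 0.21),
    (0.90, 0.80), (0.88, 0.82), (0.91, 0.79),
]

breaks = find_scene_breaks(clips, threshold=0.5)
print(breaks)  # [3]: a break is flagged between clip 2 and clip 3
```

In practice the threshold would need tuning against labeled data, and averaging over several neighboring clips on each side would likely be more robust than a single pairwise comparison.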
Pages: 61-79
Page count: 18