Audio-Based Semantic Concept Classification for Consumer Video

Cited by: 57
Authors
Lee, Keansub [1]
Ellis, Daniel P. W. [1 ]
Affiliations
[1] Columbia Univ, Dept Elect Engn, Lab Recognit & Org Speech & Audio LabROSA, New York, NY 10027 USA
Funding
National Science Foundation (USA)
Keywords
Audio classification; consumer video classification; semantic concept detection; soundtrack analysis; RETRIEVAL; MUSIC; SEGMENTATION;
DOI
10.1109/TASL.2009.2034776
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips.
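As a rough illustration of the single-Gaussian variant mentioned in the abstract, the sketch below summarizes each clip's frame sequence by a mean and full covariance, compares clips with a symmetrized Kullback-Leibler divergence, and turns the distances into a kernel matrix. This is not the authors' implementation: the function names, the random stand-in data used in place of MFCC frames, the covariance regularizer `reg`, and the exponential kernel form with scale `gamma` are all assumptions made for the example.

```python
import numpy as np

def fit_gaussian(frames, reg=1e-6):
    """Summarize a clip (n_frames x n_dims matrix) by its mean and full covariance."""
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    # Small diagonal term (an assumption) keeps the covariance invertible.
    cov = np.cov(frames, rowvar=False) + reg * np.eye(frames.shape[1])
    return mean, cov

def kl_gaussian(p, q):
    """Closed-form KL(p || q) between two multivariate Gaussians given as (mean, cov)."""
    mean_p, cov_p = p
    mean_q, cov_q = q
    dim = mean_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mean_q - mean_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_q_inv @ cov_p) + diff @ cov_q_inv @ diff
                  - dim + logdet_q - logdet_p)

def symmetric_kl(p, q):
    """Symmetrized divergence: KL(p || q) + KL(q || p)."""
    return kl_gaussian(p, q) + kl_gaussian(q, p)

def kernel_matrix(models, gamma=0.01):
    """Map pairwise divergences to similarities with exp(-gamma * D) (assumed form)."""
    n = len(models)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(models[i], models[j])
    return np.exp(-gamma * dist)

# Random stand-ins for three clips of 200 frames with 5 features each;
# the first two are drawn from nearly identical distributions.
rng = np.random.default_rng(0)
clips = [rng.normal(loc=shift, size=(200, 5)) for shift in (0.0, 0.1, 3.0)]
models = [fit_gaussian(clip) for clip in clips]
K = kernel_matrix(models)
```

A matrix like `K` could then be handed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Note that a kernel built from a KL-based distance is not guaranteed to be positive semidefinite, so this is a heuristic similarity rather than a strict Mercer kernel.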
Pages: 1406-1416
Page count: 11
Related papers
50 in total
  • [21] Audio-based Classification of Swirl Combustion Regimes Using Deep Learning
    Roy, Rishi
    Gupta, Ashwani K.
    PROCEEDINGS OF ASME POWER APPLIED R&D 2023, POWER2023, 2023,
  • [22] Automatic Audio-Based Classification of Patient Inhaler Use: A Pharmacy Based Study
    McNulty, Johnny
    Reilly, Richard B.
    Taylor, Terence E.
    O'Dwyer, Susan M.
    Costello, Richard W.
    Zigel, Yaniv
    2019 41ST ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2019, : 2606 - 2609
  • [23] An Audio-Based Deep Learning Framework For BBC Television Programme Classification
    Lam Pham
    Baume, Chris
    Kong, Qiuqiang
    Hussain, Tassadaq
    Wang, Wenwu
    Plumbley, Mark
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 56 - 60
  • [24] Audio-based context recognition
    Eronen, AJ
    Peltonen, VT
    Tuomi, JT
    Klapuri, AP
    Fagerlund, S
    Sorsa, T
    Lorho, G
    Huopaniemi, J
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (01): : 321 - 329
  • [25] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
    Cheng, Kun
    Cun, Xiaodong
    Zhang, Yong
    Xia, Menghan
    Yin, Fei
    Zhu, Mingrui
    Wang, Xuan
    Wang, Jue
    Wang, Nannan
    PROCEEDINGS SIGGRAPH ASIA 2022, 2022,
  • [26] Video semantic concept discovery using multimodal-based association classification
    Lin, Lin
    Ravitz, Guy
    Shyu, Mei-Ling
    Chen, Shu-Ching
    2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007, : 859 - +
  • [27] VIDEO SEMANTIC CONCEPT DETECTION VIA ASSOCIATIVE CLASSIFICATION
    Lin, Lin
    Shyu, Mei-Ling
    Ravitz, Guy
    Chen, Shu-Ching
    ICME: 2009 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-3, 2009, : 418 - +
  • [28] A novel fusion method for semantic concept classification in video
    Tan, Li
    Cao, Yuanda
    Yang, Minghua
    Yu, Jiong
    Journal of Software, 2009, 4 (09): : 968 - 975
  • [29] A Large-Scale UAV Audio Dataset and Audio-Based UAV Classification Using CNN
    Wang, Yaqin
    Chu, Zhiwei
    Ku, Ilmun
    Smith, E. Cho
    Matson, Eric T.
    2022 SIXTH IEEE INTERNATIONAL CONFERENCE ON ROBOTIC COMPUTING, IRC, 2022, : 186 - 189
  • [30] Developing an Audio-based Game
    Im, Byoung Uk
    Baek, Nakhoon
    2014 INTERNATIONAL CONFERENCE ON IT CONVERGENCE AND SECURITY (ICITCS), 2014,