A dynamic programming approach to audio segmentation and speech/music discrimination

Cited by: 0
Authors
Goodwin, MM [1]
Laroche, J [1]
Affiliations
[1] Creative Advanced Technology Center, Scotts Valley, CA, USA
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We consider the problem of segmenting an audio signal into characteristic regions based on feature-set similarities. In the proposed approach, a feature-space representation of the signal is generated; sequences of these feature-space samples are then aggregated into clusters corresponding to distinct signal regions. The algorithm consists of using linear discriminant analysis (LDA) to condition the feature space and dynamic programming (DP) to identify data clusters. In this paper, we consider the design of the dynamic program cost functions; we are able to derive effective cost functions without relying on significant prior information about the structure of the expected data clusters. We demonstrate the application of the LDA-DP segmentation algorithm to speech/music discrimination; experimental results are given and discussed.
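The idea of using dynamic programming to group consecutive feature-space samples into regions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `segment_features`, the segment cost (sum of squared distances to the segment mean), and the fixed per-segment `penalty` are not the cost functions derived in the paper, and the LDA conditioning step is omitted entirely.

```python
import numpy as np


def segment_features(features, penalty):
    """Split a (T, D) feature sequence into contiguous segments with DP.

    Assumed cost (not the paper's): squared deviation from the segment
    mean, plus a fixed `penalty` for every segment that is opened.
    Returns the start index of each segment, in increasing order.
    """
    x = np.asarray(features, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as T frames of dimension 1
    n_frames = x.shape[0]

    # Prefix sums make the cost of any segment available in O(D).
    s1 = np.vstack([np.zeros(x.shape[1]), np.cumsum(x, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((x ** 2).sum(axis=1))])

    def seg_cost(i, j):
        # Squared deviation from the mean for frames i .. j-1.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - (total @ total) / (j - i)

    # best[j] = minimum cost of segmenting the first j frames.
    best = np.full(n_frames + 1, np.inf)
    best[0] = 0.0
    prev = np.zeros(n_frames + 1, dtype=int)
    for j in range(1, n_frames + 1):
        for i in range(j):
            cost = best[i] + seg_cost(i, j) + penalty
            if cost < best[j]:
                best[j] = cost
                prev[j] = i

    # Backtrack from the end to recover the segment start indices.
    starts = []
    j = n_frames
    while j > 0:
        starts.append(int(prev[j]))
        j = int(prev[j])
    return starts[::-1]
```

For example, `segment_features([0.0] * 4 + [5.0] * 4, penalty=1.0)` returns `[0, 4]`, placing one boundary where the feature value jumps. A larger `penalty` discourages boundaries, which stands in for the kind of trade-off a transition cost would control.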
Pages: 309-312
Page count: 4
Related papers
50 in total
  • [1] Speech/Music Discrimination in Audio Podcast Using Structural Segmentation and Timbre Recognition
    Barthet, Mathieu
    Hargreaves, Steven
    Sandler, Mark
    [J]. EXPLORING MUSIC CONTENTS, 2011, 6684 : 138 - 162
  • [2] Audio coding improvement using evolutionary speech/music discrimination
    Exposito, J. E. Munoz
    Galan, S. Garcia
    Reyes, N. Ruiz
    Candeas, R. Vera
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-4, 2007, : 822 - 827
  • [3] A fast and robust speech/music discrimination approach
    Wang, WQ
    Gao, W
    Ying, DW
    [J]. ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 1325 - 1329
  • [4] Integration of Speech/Music Discrimination and Mood Classification with Audio Feature Extraction
    Ashraf, Mohsin
    Geng Guohua
    Wang, Xiaofeng
    Ahmad, Farooq
    [J]. 2018 INTERNATIONAL CONFERENCE ON FRONTIERS OF INFORMATION TECHNOLOGY (FIT 2018), 2018, : 224 - 229
  • [5] Expert system for intelligent audio codification based in speech/music discrimination
    Exposito, J. E. Munoz
    Galan, S. Garcia
    Reyes, N. Ruiz
    Candeas, P. Vera
    Pena, F. Rivas
    [J]. 2006 INTERNATIONAL SYMPOSIUM ON EVOLVING FUZZY SYSTEMS, PROCEEDINGS, 2006, : 318 - +
  • [6] Simultaneous speech segmentation and phoneme recognition using dynamic programming
    Bajwa, RS
    Owens, RM
    Kelliher, TP
    [J]. 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996, : 3213 - 3216
  • [7] Speech/music discrimination-based audio characterization using blind watermarking scheme
    Mezghani, Eya
    Charfeddine, Maha
    Nicolas, Henri
    Ben Amar, Chokri
    [J]. JOURNAL OF INFORMATION ASSURANCE AND SECURITY, 2016, 11 (06): : 311 - 321
  • [8] SPEECH/MUSIC DISCRIMINATION BASED ON WARPING TRANSFORMATION AND FUZZY LOGIC FOR INTELLIGENT AUDIO CODING
    Enrique Munoz-Exposito, Jose
    Garcia Galan, Sebastian
    Ruiz Reyes, Nicolas
    Vera Candeas, Pedro
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2009, 23 (05) : 427 - 442
  • [9] An RNN-Based Speech-Music Discrimination Used for Hybrid Audio Coder
    Yang, Wanzhao
    Tu, Weiping
    Zheng, Jiaxi
    Zhang, Xiong
    Yang, Yuhong
    Song, Yucheng
    [J]. MULTIMEDIA MODELING, MMM 2018, PT I, 2018, 10704 : 81 - 92
  • [10] Combined speech and audio coding by discrimination
    Tancerel, L
    Ragot, S
    Ruoppila, VT
    Lefebvre, R
    [J]. 2000 IEEE WORKSHOP ON SPEECH CODING, PROCEEDINGS: MEETING THE CHALLENGES OF THE NEW MILLENNIUM, 2000, : 154 - 156