A dynamic programming approach to audio segmentation and speech/music discrimination

Cited by: 0
Authors
Goodwin, MM [1]
Laroche, J [1]
Affiliations
[1] Creat Adv Technol Ctr, Scotts Valley, CA USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We consider the problem of segmenting an audio signal into characteristic regions based on feature-set similarities. In the proposed approach, a feature-space representation of the signal is generated; sequences of these feature-space samples are then aggregated into clusters corresponding to distinct signal regions. The algorithm consists of using linear discriminant analysis (LDA) to condition the feature space and dynamic programming (DP) to identify data clusters. In this paper, we consider the design of the dynamic program cost functions; we are able to derive effective cost functions without relying on significant prior information about the structure of the expected data clusters. We demonstrate the application of the LDA-DP segmentation algorithm to speech/music discrimination; experimental results are given and discussed.
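The abstract does not give the dynamic program's cost functions, so the following is only a minimal sketch of the general idea of segmenting a feature sequence by DP. It assumes the features have already been conditioned (for example by LDA) and uses a generic stand-in cost: the within-segment squared deviation from the segment mean plus a fixed per-segment penalty. The function name `segment_features` and the `penalty` parameter are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def segment_features(
    features: Sequence[Sequence[float]], penalty: float
) -> List[Tuple[int, int]]:
    """Split a feature sequence into contiguous segments by dynamic programming.

    Assumed cost (not the paper's): squared deviation of each sample from its
    segment mean, plus a fixed `penalty` per segment.
    Returns half-open (start, end) index pairs covering the whole sequence.
    """
    n = len(features)
    if n == 0:
        return []
    dim = len(features[0])

    # Prefix sums of the vectors and of their squared norms, so that the cost
    # of any candidate segment can be evaluated in O(dim) time.
    prefix_sum = [[0.0] * dim]
    prefix_sq = [0.0]
    for x in features:
        prefix_sum.append([s + v for s, v in zip(prefix_sum[-1], x)])
        prefix_sq.append(prefix_sq[-1] + sum(v * v for v in x))

    def segment_cost(i: int, j: int) -> float:
        # Sum of squared distances to the mean of features[i:j].
        sq = prefix_sq[j] - prefix_sq[i]
        norm_of_sum = sum(
            (a - b) ** 2 for a, b in zip(prefix_sum[j], prefix_sum[i])
        )
        return sq - norm_of_sum / (j - i)

    # best[j] is the minimum total cost of segmenting features[:j];
    # back[j] is the start index of the last segment in that solution.
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(i, j) + penalty
            if cost < best[j]:
                best[j] = cost
                back[j] = i

    # Follow the back-pointers from the end to recover the boundaries.
    segments = []
    j = n
    while j > 0:
        segments.append((back[j], j))
        j = back[j]
    segments.reverse()
    return segments
```

In this sketch the penalty controls how readily a new region is opened: a larger value favors fewer, longer segments. Assigning each recovered segment a speech or music label would be a separate classification step that is not shown here.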
Pages: 309 - 312
Page count: 4
Related papers
50 in total
  • [21] New speech/music discrimination approach based on warping transformation and ANFIS
    Munoz-Exposito, J. E.
    Ruiz-Reyes, N.
    Garcia-Galan, S.
    Vera-Candeas, P.
    [J]. JOURNAL OF NEW MUSIC RESEARCH, 2006, 35 (03) : 237 - 247
  • [22] ARTIFICIALLY SYNTHESISING DATA FOR AUDIO CLASSIFICATION AND SEGMENTATION TO IMPROVE SPEECH AND MUSIC DETECTION IN RADIO BROADCAST
    Venkatesh, Satvik
    Moffat, David
    Kirke, Alexis
    Shakeri, Gozel
    Brewster, Stephen
    Fachner, Jorg
    Odell-Miller, Helen
    Street, Alex
    Farina, Nicolas
    Banerjee, Sube
    Miranda, Eduardo Reck
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 636 - 640
  • [23] New speech/music discrimination approach based on fundamental frequency estimation
    Ruiz-Reyes, N.
    Vera-Candeas, P.
    Munoz, J. E.
    Garcia-Galan, S.
    Canadas, F. J.
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2009, 41 (02) : 253 - 286
  • [24] New speech/music discrimination approach based on fundamental frequency estimation
    N. Ruiz-Reyes
    P. Vera-Candeas
    J. E. Muñoz
    S. García-Galán
    F. J. Cañadas
    [J]. Multimedia Tools and Applications, 2009, 41 : 253 - 286
  • [25] Speech and Singing Discrimination for Audio Data Indexing
    Tsai, Wei-Ho
    Ma, Cin-Hao
    [J]. 2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS), 2014, : 276 - 280
  • [26] A dynamic programming framework for neural network-based automatic speech segmentation
    van Vuuren, Van Zyl
    ten Bosch, Louis
    Niesler, Thomas
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2286 - 2290
  • [27] Efficient audio-driven multimedia indexing through similarity-based speech/music discrimination
    Tsipas, Nikolaos
    Vrysis, Lazaros
    Dimoulas, Charalampos
    Papanikolaou, George
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (24) : 25603 - 25621
  • [28] Video assisted segmentation of speech and audio track
    Pandit, M
    Yusoff, Y
    Kittler, J
    Christmas, WJ
    Chilton, EHS
    [J]. MULTIMEDIA STORAGE AND ARCHIVING SYSTEMS IV, 1999, 3846 : 68 - 77
  • [29] Efficient audio-driven multimedia indexing through similarity-based speech / music discrimination
    Nikolaos Tsipas
    Lazaros Vrysis
    Charalampos Dimoulas
    George Papanikolaou
    [J]. Multimedia Tools and Applications, 2017, 76 : 25603 - 25621
  • [30] Speech/music discrimination for robust speech recognition in robots
    Choi, Mu Yeol
    Song, Hwa Jeon
    Kim, Hyung Soon
    [J]. 2007 RO-MAN: 16TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, VOLS 1-3, 2007, : 118 - +