A dynamic programming approach to audio segmentation and speech/music discrimination

Cited by: 0

Authors:

Goodwin, MM [1]

Laroche, J [1]

Affiliations:

[1] Creat Adv Technol Ctr, Scotts Valley, CA USA

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PROCEEDINGS: AUDIO AND ELECTROACOUSTICS SIGNAL PROCESSING FOR COMMUNICATIONS | 2004

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

We consider the problem of segmenting an audio signal into characteristic regions based on feature-set similarities. In the proposed approach, a feature-space representation of the signal is generated; sequences of these feature-space samples are then aggregated into clusters corresponding to distinct signal regions. The algorithm consists of using linear discriminant analysis (LDA) to condition the feature space and dynamic programming (DP) to identify data clusters. In this paper, we consider the design of the dynamic program cost functions; we are able to derive effective cost functions without relying on significant prior information about the structure of the expected data clusters. We demonstrate the application of the LDA-DP segmentation algorithm to speech/music discrimination; experimental results are given and discussed.
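The clustering step described in the abstract can be illustrated with a generic dynamic program over a sequence of feature vectors. The sketch below is an assumption-laden illustration, not the paper's method: this record does not give the paper's cost functions, so a simple within-segment squared-error cost plus a fixed per-segment `penalty` stands in for them, and the LDA conditioning step is omitted. The names `segment_dp` and `penalty` are hypothetical.

```python
from math import inf
from typing import List, Sequence, Tuple


def segment_dp(features: Sequence[Sequence[float]],
               penalty: float) -> List[Tuple[int, int]]:
    """Split a feature sequence into contiguous segments.

    Minimizes the sum over segments of (squared deviation from the
    segment mean) + penalty. This cost is an illustrative stand-in,
    NOT the cost function designed in the paper.
    Returns half-open index ranges (start, end).
    """
    n = len(features)
    if n == 0:
        return []
    dim = len(features[0])

    # Prefix sums let us evaluate any segment cost in O(dim).
    prefix_sum = [[0.0] * dim]
    prefix_sq = [0.0]
    for x in features:
        prefix_sum.append([a + b for a, b in zip(prefix_sum[-1], x)])
        prefix_sq.append(prefix_sq[-1] + sum(v * v for v in x))

    def cost(i: int, j: int) -> float:
        # Squared error of frames i..j-1 around their mean.
        length = j - i
        sq = prefix_sq[j] - prefix_sq[i]
        s2 = sum((prefix_sum[j][k] - prefix_sum[i][k]) ** 2
                 for k in range(dim))
        return sq - s2 / length

    # best[j] = minimal total cost of segmenting frames 0..j-1.
    best = [0.0] + [inf] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + cost(i, j) + penalty
            if c < best[j]:
                best[j] = c
                back[j] = i

    # Backtrack from the end to recover segment boundaries.
    segments = []
    j = n
    while j > 0:
        i = back[j]
        segments.append((i, j))
        j = i
    segments.reverse()
    return segments


# Toy one-dimensional input: five frames near 0, then five near 10.
toy = [[0.0]] * 5 + [[10.0]] * 5
print(segment_dp(toy, penalty=1.0))
```

A larger `penalty` favors fewer, longer segments; a smaller one allows more boundaries. In a speech/music setting the resulting segments would still need to be labeled by a separate classifier, which this sketch does not attempt.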


Pages: 309-312

Page count: 4
