A dynamic programming approach to audio segmentation and speech/music discrimination

Times cited: 0

Authors:

Goodwin, MM [1]

Laroche, J [1]

Affiliation:

[1] Creat Adv Technol Ctr, Scotts Valley, CA USA

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PROCEEDINGS: AUDIO AND ELECTROACOUSTICS SIGNAL PROCESSING FOR COMMUNICATIONS | 2004

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

We consider the problem of segmenting an audio signal into characteristic regions based on feature-set similarities. In the proposed approach, a feature-space representation of the signal is generated; sequences of these feature-space samples are then aggregated into clusters corresponding to distinct signal regions. The algorithm consists of using linear discriminant analysis (LDA) to condition the feature space and dynamic programming (DP) to identify data clusters. In this paper, we consider the design of the dynamic program cost functions; we are able to derive effective cost functions without relying on significant prior information about the structure of the expected data clusters. We demonstrate the application of the LDA-DP segmentation algorithm to speech/music discrimination; experimental results are given and discussed.
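The dynamic-programming step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's cost functions, so the sketch substitutes a generic choice that is an assumption, not the authors' design: each segment costs its within-segment sum of squared deviations plus a fixed per-segment penalty. It also assumes the features have already been reduced to one dimension (for example by an LDA projection, which is not shown). The names `segment` and `penalty` are hypothetical.

```python
def segment(features, penalty):
    """Split a 1-D feature sequence into contiguous segments.

    Assumed cost (not taken from the paper): within-segment sum of
    squared deviations from the segment mean, plus `penalty` per segment.
    Returns the list of segment start indices.
    """
    n = len(features)
    # Prefix sums of x and x^2 allow any segment cost to be read in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for k, x in enumerate(features):
        s1[k + 1] = s1[k] + x
        s2[k + 1] = s2[k] + x * x

    def cost(i, j):
        # Sum of squared deviations from the mean over features[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] is the minimum total cost of segmenting features[:j];
    # prev[j] is the start index of the last segment in that solution.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i

    # Walk the back-pointers from the end to recover the segment starts.
    starts = []
    j = n
    while j > 0:
        j = prev[j]
        starts.append(j)
    return starts[::-1]


if __name__ == "__main__":
    toy = [0.0] * 5 + [10.0] * 5 + [0.0] * 4
    print(segment(toy, penalty=1.0))  # [0, 5, 10]
```

A larger `penalty` favors fewer, longer segments; the double loop is quadratic in the number of frames, which is acceptable for a short illustration but would need pruning for long recordings.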

Citation

Pages: 309-312

Page count: 4
