Design of an efficient music-speech discriminator

Cited by: 4

Authors:

Tardon, Lorenzo J. [1]

Sammartino, Simone [1]

Barbancho, Isabel [1]

Affiliations:

[1] Univ Malaga, ETS Ingn Telecomunicac, Dept Ingn Comunicac, E-29071 Malaga, Spain

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2010 / Vol. 127 / Issue 01

Keywords:

CLASSIFICATION; RECOGNITION;

DOI:

10.1121/1.3257204

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

This paper addresses the design of a simple and efficient music-speech discriminator for large audio data sets in which advanced music playing techniques are taught and voice and music are intrinsically interleaved. In the process, a number of features used in speech-music discrimination are defined and evaluated over the available data set. Specifically, the data set contains pieces of classical music played with different and unspecified instruments (or even lyrics) and the voice of a teacher (a top music performer) or even the overlapped voice of the translator and other persons. After an initial test of the performance of the features implemented, a selection process is started, which takes into account the type of classifier selected beforehand, to achieve good discrimination performance and computational efficiency, as shown in the experiments. The discrimination application has been defined and tested on a large data set supplied by Fundacion Albeniz, containing a large variety of classical music pieces played with different instruments, which include comments and speeches of famous performers. (C) 2010 Acoustical Society of America. [DOI: 10.1121/1.3257204]


Pages: 271-279

Page count: 9

50 items in total

[1] MUSIC MODELS FOR MUSIC-SPEECH SEPARATION
Hughes, Thad
Kristjansson, Trausti
[J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4917 - 4920
[3] A robust and computationally efficient Speech/Music discriminator
Jayme, Garcia Arnal Barbedo
Lopes, Amauri
[J]. JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2006, 54 (7-8) : 571 - 588
[4] Robust singing detection in speech/music discriminator design
Chou, W
Gu, L
[J]. 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING, 2001, : 865 - 868
[5] LEVERAGING STRUCTURAL INFORMATION IN MUSIC-SPEECH DECTECTION
Han, Jinyu
Coover, Bob
[J]. ELECTRONIC PROCEEDINGS OF THE 2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (ICMEW), 2013,
[6] Speech Segregation based on Pitch Track Correction and Music-Speech Classification
Kim, Han-Gyu
Jang, Gil-Jin
Park, Jeong-Sik
Kim, Ji-Hwan
Oh, Yung-Hwan
[J]. ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, 2012, 12 (02) : 15 - 20
[7] Music Component Characterization in the Music-Speech Mixture for Female Singing Tracks
Sharma, Shivam
Mittal, Vinay Kumar
[J]. 2017 2ND INTERNATIONAL CONFERENCE ON TELECOMMUNICATION AND NETWORKS (TEL-NET), 2017, : 126 - 132
[8] Random fourier feature based music-speech classification
Vyshnav, M. T.
Kumar, S. Sachin
Mohan, Neethu
Soman, K. P.
[J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (05) : 6353 - 6363
[9] Mixed wideband speech and music coding using a speech/music discriminator
Qiao, RY
[J]. IEEE TENCON'97 - IEEE REGIONAL 10 ANNUAL CONFERENCE, PROCEEDINGS, VOLS 1 AND 2: SPEECH AND IMAGE TECHNOLOGIES FOR COMPUTING AND TELECOMMUNICATIONS, 1997, : 605 - 608
[10] Construction and evaluation of a robust multifeature speech/music discriminator
Scheirer, E
Slaney, M
[J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1331 - 1334
