KALAKA: a TV Broadcast Speech Database for the Evaluation of Language Recognition Systems

Cited by: 0

Authors:

Rodriguez-Fuentes, Luis J. [1]

Penagarikano, Mikel [1]

Bordel, German [1]

Varona, Amparo [1]

Diez, Mireia [1]

Affiliations:

[1] Univ Basque Country, Dept Elect & Elect, Software Technol Working Grp, Leioa 48940, Spain

Source:

LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2010

Keywords:

SUPPORT VECTOR MACHINES; SPEAKER;

DOI:

Not available

Chinese Library Classification (CLC) number:

H [Language and Linguistics];

Discipline classification code:

05 ;

Abstract:

A speech database, named KALAKA, was created to support the Albayzin 2008 Evaluation of Language Recognition Systems, organized by the Spanish Network on Speech Technologies from May to November 2008. This evaluation, designed according to the criteria and methodology applied in the NIST Language Recognition Evaluations, involved four target languages: Basque, Catalan, Galician and Spanish (official languages in Spain), and included speech signals in other (unknown) languages to allow open-set verification trials. In this paper, the process of designing, collecting data and building the train, development and evaluation datasets of KALAKA is described. Results attained in the Albayzin 2008 LRE are presented as a means of evaluating the database. The performance of a state-of-the-art language recognition system on a closed-set evaluation task is also presented for reference. Future work includes extending KALAKA by adding Portuguese and English as target languages and renewing the set of unknown languages needed to carry out open-set evaluations.


Pages: 1678 - 1685

Number of pages: 8

50 items in total

[1] KALAKA-2: a TV Broadcast Speech Database for the Recognition of Iberian Languages in Clean and Noisy Environments
Rodriguez-Fuentes, Luis J.
Penagarikano, Mikel
Varona, Amparo
Diez, Mireia
Bordel, German
LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012 : 99 - 105
[2] Spoken language recognition in conversational telephone speech and TV broadcast news (GLOSA)
Javier Rodriguez-Fuentes, Luis
Varona, Amparo
Penagarikano, Mikel
Diez, Mireia
Bordel, German
PROCESAMIENTO DEL LENGUAJE NATURAL, 2011, (47) : 349 - 350
[3] KALAKA-3: a database for the assessment of spoken language recognition technology on YouTube audios
Luis Javier Rodríguez-Fuentes
Mikel Penagarikano
Amparo Varona
Mireia Diez
Germán Bordel
Language Resources and Evaluation, 2016, 50 : 221 - 243
[4] KALAKA-3: a database for the assessment of spoken language recognition technology on YouTube audios
Javier Rodriguez-Fuentes, Luis
Penagarikano, Mikel
Varona, Amparo
Diez, Mireia
Bordel, German
LANGUAGE RESOURCES AND EVALUATION, 2016, 50 (02) : 221 - 243
[5] Development and Evaluation of Speech Database in Automotive Environments for Practical Speech Recognition Systems
Obuchi, Yasunari
Hataoka, Nobuo
INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006 : 2314 - 2317
[6] Speech recognition of broadcast news for the European Portuguese language
Meinedo, H
Souto, N
Neto, JP
ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001 : 319 - 322
[7] The Slovenian BNSI Broadcast News database for continuous speech recognition
Zgank, Andrej
Verdonik, Darinka
Kacic, Zdravko
ELEKTROTEHNISKI VESTNIK-ELECTROTECHNICAL REVIEW, 2008, 75 (03) : 85 - 90
[8] Towards mixed language speech recognition systems
Imseng, David
Bourlard, Herve
Magimai-Doss, Mathew
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 1-2, 2010 : 278 - 281
[9] Effectiveness of word string language models on noisy broadcast news speech recognition
Takagi, K
Oguro, R
Ozeki, K
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2002, E85D (07) : 1130 - 1137
[10] KALAKA-3: a database for the recognition of spoken European languages on YouTube audios
Javier Rodriguez-Fuentes, Luis
Penagarikano, Mikel
Varona, Amparo
Diez, Mireia
Bordel, German
LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014 : 443 - 449
