KALAKA: a TV Broadcast Speech Database for the Evaluation of Language Recognition Systems

Authors
Rodriguez-Fuentes, Luis J. [1]
Penagarikano, Mikel [1]
Bordel, German [1]
Varona, Amparo [1]
Diez, Mireia [1]
Affiliations
[1] Univ Basque Country, Dept Elect & Elect, Software Technol Working Grp, Leioa 48940, Spain
Keywords
SUPPORT VECTOR MACHINES; SPEAKER
DOI
Not available
Chinese Library Classification (CLC) number
H [Language and Writing]
Subject classification code
05
Abstract
A speech database, named KALAKA, was created to support the Albayzin 2008 Evaluation of Language Recognition Systems, organized by the Spanish Network on Speech Technologies from May to November 2008. This evaluation, designed according to the criteria and methodology applied in the NIST Language Recognition Evaluations, involved four target languages: Basque, Catalan, Galician and Spanish (official languages in Spain), and included speech signals in other (unknown) languages to allow open-set verification trials. In this paper, the process of designing, collecting data and building the train, development and evaluation datasets of KALAKA is described. Results attained in the Albayzin 2008 LRE are presented as a means of evaluating the database. The performance of a state-of-the-art language recognition system on a closed-set evaluation task is also presented for reference. Future work includes extending KALAKA by adding Portuguese and English as target languages and renewing the set of unknown languages needed to carry out open-set evaluations.
Pages: 1678-1685
Page count: 8