Automatic speech activity detection, source localization, and speech recognition on the CHIL seminar corpus

Cited by: 0

Authors:

Macho, D [1]

Padrell, J [1]

Abad, A [1]

Nadeu, C [1]

Hernando, J [1]

McDonough, J [1]

Wölfel, M [1]

Klee, W [1]

Omologo, M [1]

Brutti, A [1]

Svaizer, P [1]

Potamianos, G [1]

Chu, SM [1]

Affiliations:

[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain

Source:

2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2 | 2005

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To realize the long-term goal of ubiquitous computing, technological advances in multi-channel acoustic analysis are needed to solve several basic problems, including speaker localization and tracking, speech activity detection (SAD), and distant-talking automatic speech recognition (ASR). The European Commission integrated project CHIL, "Computers in the Human Interaction Loop", aims to make significant advances in these three technologies. In this work, we report the results of our initial automatic source localization, speech activity detection, and speech recognition experiments on the CHIL seminar corpus, which comprises spontaneous speech collected by both near- and far-field microphones. In addition to the audio sensors, the seminars were also recorded by calibrated video cameras. This simultaneous audio-visual data capture enables a realistic evaluation of component technologies that was never possible with earlier databases.

Pages: 877-880

Page count: 4
