Automatic speech activity detection, source localization, and speech recognition on the CHIL seminar corpus

Cited by: 0

Authors:

Macho, D [1]

Padrell, J [1]

Abad, A [1]

Nadeu, C [1]

Hernando, J [1]

McDonough, J [1]

Wölfel, M [1]

Klee, W [1]

Omologo, M [1]

Brutti, A [1]

Svaizer, P [1]

Potamianos, G [1]

Chu, SM [1]

Affiliations:

[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain

Source:

2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2 | 2005

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To realize the long-term goal of ubiquitous computing, technological advances in multi-channel acoustic analysis are needed to solve several basic problems, including speaker localization and tracking, speech activity detection (SAD), and distant-talking automatic speech recognition (ASR). The European Commission integrated project CHIL, "Computers in the Human Interaction Loop", aims to make significant advances in these three technologies. In this work, we report the results of our initial automatic source localization, speech activity detection, and speech recognition experiments on the CHIL seminar corpus, which comprises spontaneous speech collected by both near- and far-field microphones. In addition to the audio sensors, the seminars were also recorded by calibrated video cameras. This simultaneous audio-visual data capture enables a realistic evaluation of component technologies that was never possible with earlier databases.


Pages: 877-880

Page count: 4
