Automatic speech activity detection, source localization, and speech recognition on the CHIL seminar corpus

被引:0
|
作者
Macho, D [1 ]
Padrell, J [1 ]
Abad, A [1 ]
Nadeu, C [1 ]
Hernando, J [1 ]
McDonough, J [1 ]
Wölfel, M [1 ]
Klee, W [1 ]
Omologo, M [1 ]
Brutti, A [1 ]
Svaizer, P [1 ]
Potamianos, G [1 ]
Chu, SM [1 ]
机构
[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
To realize the long-term goal of ubiquitous Computing, technological advances in multi-channel acoustic analysis are needed in order to solve several basic problems, including speaker localization and tracking, speech activity detection (SAD) and distant-talking automatic speech recognition (ASR). The European Commission integrated project CHIL, "Computers in the Human Interaction Loop", aims to make significant advances in these three technologies. In this work, we report the results of our initial automatic source localization, speech activity detection, and speech recognition experiments on the CHIL seminar corpus, Which is comprised of spontaneous speech collected by both near- and far-field microphones. In addition to the audio sensors, the seminars were also recorded by calibrated video cameras. This simultaneous audio-visual data capture enables the realistic evaluation of component technologies as was never possible with earlier data bases.
引用
收藏
页码:877 / 880
页数:4
相关论文
共 50 条
  • [21] RODIGITS - A ROMANIAN CONNECTED-DIGITS SPEECH CORPUS FOR AUTOMATIC SPEECH AND SPEAKER RECOGNITION
    Georgescu, Alexandru Lucian
    Caranica, Alexandru
    Cucu, Horia
    Burileanu, Corneliu
    [J]. UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2018, 80 (03): : 45 - 62
  • [22] ALGERIAN ARABIC SPEECH DATABASE (ALGASD): CORPUS DESIGN AND AUTOMATIC SPEECH RECOGNITION APPLICATION
    Droua-Hamdani, Ghania
    Selouani, Sid Ahmed
    Boudraa, Malika
    [J]. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2010, 35 (2C): : 157 - 166
  • [23] Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
    Sivasankaran, Sunit
    Vincent, Emmanuel
    Fohr, Dominique
    [J]. 28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 346 - 350
  • [24] Unsupervised speech/non-speech detection for automatic speech recognition in meeting rooms
    Maganti, Hari Krishna
    Motlicek, Petr
    Gatica-Perez, Daniel
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1037 - +
  • [25] Automatic detection of a prosodic hierarchy in a journalistic speech corpus
    Gendrot, Cedric
    Gerdes, Kim
    Adda-Decker, Martine
    [J]. LANGUE FRANCAISE, 2016, (191): : 123 - +
  • [26] A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline
    Khassanov, Yerbolat
    Mussakhojayeva, Saida
    Mirzakhmetov, Almas
    Adiyev, Alen
    Nurpeiissov, Mukhamet
    Varol, Huseyin Atakan
    [J]. 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), 2021, : 697 - 706
  • [27] Corpus Construction for Deaf Speakers and Analysis by Automatic Speech Recognition
    Kobayashi, Akio
    Yasu, Keiichi
    [J]. 2023 ASIA PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE, APSIPA ASC, 2023, : 2294 - 2298
  • [28] TED-LIUM: an Automatic Speech Recognition dedicated corpus
    Rousseau, Anthony
    Deleglise, Paul
    Esteve, Yannick
    [J]. LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012, : 125 - 129
  • [29] An audio-visual corpus for multimodal automatic speech recognition
    Andrzej Czyzewski
    Bozena Kostek
    Piotr Bratoszewski
    Jozef Kotus
    Marcin Szykulski
    [J]. Journal of Intelligent Information Systems, 2017, 49 : 167 - 192
  • [30] An audio-visual corpus for multimodal automatic speech recognition
    Czyzewski, Andrzej
    Kostek, Bozena
    Bratoszewski, Piotr
    Kotus, Jozef
    Szykulski, Marcin
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2017, 49 (02) : 167 - 192