Automatic speech activity detection, source localization, and speech recognition on the CHIL seminar corpus

被引:0
|
作者
Macho, D [1 ]
Padrell, J [1 ]
Abad, A [1 ]
Nadeu, C [1 ]
Hernando, J [1 ]
McDonough, J [1 ]
Wölfel, M [1 ]
Klee, W [1 ]
Omologo, M [1 ]
Brutti, A [1 ]
Svaizer, P [1 ]
Potamianos, G [1 ]
Chu, SM [1 ]
机构
[1] Univ Politecn Cataluna, TALP Res Ctr, Barcelona, Spain
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
To realize the long-term goal of ubiquitous Computing, technological advances in multi-channel acoustic analysis are needed in order to solve several basic problems, including speaker localization and tracking, speech activity detection (SAD) and distant-talking automatic speech recognition (ASR). The European Commission integrated project CHIL, "Computers in the Human Interaction Loop", aims to make significant advances in these three technologies. In this work, we report the results of our initial automatic source localization, speech activity detection, and speech recognition experiments on the CHIL seminar corpus, Which is comprised of spontaneous speech collected by both near- and far-field microphones. In addition to the audio sensors, the seminars were also recorded by calibrated video cameras. This simultaneous audio-visual data capture enables the realistic evaluation of component technologies as was never possible with earlier data bases.
引用
收藏
页码:877 / 880
页数:4
相关论文
共 50 条
  • [1] Automatic speech recognition and speech activity detection in the CHIL smart room
    Chu, SM
    Marcheret, E
    Potamianos, G
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2005, 3869 : 332 - 343
  • [2] Corpus for automatic speech recognition
    Adda-Decker, Martine
    [J]. REVUE FRANCAISE DE LINGUISTIQUE APPLIQUEE, 2007, 12 (01): : 71 - 84
  • [3] Creation of Marathi Speech Corpus for Automatic Speech Recognition
    Gaikwad, Santosh
    Gawali, Bharti
    Mehrotra, Suresh
    [J]. 2013 INTERNATIONAL CONFERENCE ORIENTAL COCOSDA HELD JOINTLY WITH 2013 CONFERENCE ON ASIAN SPOKEN LANGUAGE RESEARCH AND EVALUATION (O-COCOSDA/CASLRE), 2013,
  • [4] The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition
    Mukiibi, Jonathan
    Katumba, Andrew
    Nakatumba-Nabende, Joyce
    Hussein, Ali
    Meyer, Josh
    [J]. LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 1945 - 1954
  • [5] Bangladeshi Bangla speech corpus for automatic speech recognition research
    Kibria, Shafkat
    Samin, Ahnaf Mozib
    Kobir, M. Humayon
    Rahman, M. Shahidur
    Selim, M. Reza
    Iqbal, M. Zafar
    [J]. Speech Communication, 2022, 136 : 84 - 97
  • [6] RSC: A Romanian Read Speech Corpus for Automatic Speech Recognition
    Georgescu, Alexandru-Lucian
    Cucu, Horia
    Buzo, Andi
    Burileanu, Corneliu
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 6606 - 6612
  • [7] Bangladeshi Bangla speech corpus for automatic speech recognition research
    Kibria, Shafkat
    Samin, Ahnaf Mozib
    Kobir, M. Humayon
    Rahman, M. Shahidur
    Selim, M. Reza
    Iqbal, M. Zafar
    [J]. SPEECH COMMUNICATION, 2022, 136 : 84 - 97
  • [8] Chhattisgarhi speech corpus for research and development in automatic speech recognition
    Londhe, Narendra D.
    Kshirsagar, Ghanahshyam B.
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2018, 21 (02) : 193 - 210
  • [9] An automatic speech recognition system for spontaneous Punjabi speech corpus
    Kumar Y.
    Singh N.
    [J]. International Journal of Speech Technology, 2017, 20 (2) : 297 - 303
  • [10] KsponSpeech: Korean Spontaneous Speech Corpus for Automatic Speech Recognition
    Bang, Jeong-Uk
    Yun, Seung
    Kim, Seung-Hi
    Choi, Mu-Yeol
    Lee, Min-Kyu
    Kim, Yeo-Jeong
    Kim, Dong-Hyun
    Park, Jun
    Lee, Young-Jik
    Kim, Sang-Hun
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (19): : 1 - 17