A Free Synthetic Corpus for Speaker Diarization Research

Cited by: 5
Authors
Edwards, Erik [1]
Brenndoerfer, Michael [2]
Robinson, Amanda [1]
Sadoughi, Najmeh [1]
Finley, Greg P. [1]
Korenevsky, Maxim [1]
Axtmann, Nico [3]
Miller, Mark [1]
Suendermann-Oeft, David [1]
Affiliations
[1] EMR AI Inc, San Francisco, CA 94105 USA
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
[3] DHBW, Karlsruhe, Germany
Keywords
Speaker diarization; Speech activity detection; Open-source corpora; Noise
DOI
10.1007/978-3-319-99579-3_13
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A synthetic corpus of dialogs was constructed from the LibriSpeech corpus and is made freely available for diarization research. It includes over 90 h of training data and over 9 h each of development and test data. Both 2-person and 3-person dialogs, with and without overlap, are included. Timing information is provided in several formats and includes not only speaker segmentations but also phoneme segmentations. As such, it is a useful starting point for general, particularly early-stage, diarization system development.
Pages: 113-122
Page count: 10