IMPROVING MULTIPLE-CROWD-SOURCED TRANSCRIPTIONS USING A SPEECH RECOGNISER

Cited by: 0
Authors
van Dalen, R. C. [1]
Knill, K. M. [1]
Tsiakoulis, P. [1]
Gales, M. J. F. [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Trumpington St, Cambridge CB2 1PZ, England
Keywords
Automatic speech recognition; crowd-sourcing; transcription combination
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper introduces a method to produce high-quality transcriptions of speech data from only two crowd-sourced transcriptions. These transcriptions, produced cheaply by people on the Internet, for example through Amazon Mechanical Turk, are often of low quality. Often, multiple crowd-sourced transcriptions are combined to form one transcription of higher quality. However, the state of the art is to use essentially a form of majority voting, which requires at least three transcriptions for each utterance. This paper shows how to refine this approach to work with only two transcriptions. It then introduces a method that uses a speech recogniser (bootstrapped on a simple combination scheme) to combine transcriptions. When only two crowd-sourced transcriptions are available, on a noisy data set this improves the word error rate to gold-standard transcriptions by 21% relative.
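The abstract reports its result as a relative change in word error rate (WER) against gold-standard transcriptions. As a minimal sketch of how such a figure is usually computed, the Python below scores a hypothesis against a reference with word-level edit distance and then takes the relative reduction between two WER values. The function names `word_error_rate` and `relative_improvement` are hypothetical, and whitespace tokenisation is an assumption; the record does not describe the paper's scoring tools or text normalisation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenisation
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction, e.g. 0.21 for a 21% relative improvement."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

For example, a hypothetical baseline WER of 0.20 that drops to 0.158 corresponds to a relative improvement of 0.21. These numbers are illustrative only and are not taken from the paper.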
Pages: 4709-4713
Number of pages: 5