Automatic Speech Recognition for Supporting Endangered Language Documentation

Cited by: 0
Authors
Prud'hommeaux, Emily [1]
Jimerson, Robbie [2]
Hatcher, Richard [3]
Michelson, Karin [3]
Affiliations
[1] Boston Coll, Chestnut Hill, MA 02167 USA
[2] Rochester Inst Technol, Rochester, NY USA
[3] Univ Buffalo, Buffalo, NY USA
Source
Funding
National Science Foundation (USA);
Keywords
UNDER-RESOURCED LANGUAGES; NEURAL-NETWORKS; TRANSCRIPTION; ALIGNMENT; ASR;
DOI
Not available
Chinese Library Classification (CLC) number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but the practical utility of ASR for this purpose has not been fully explored. In this paper, we present results of a study in which both linguists and community members, with varying levels of language proficiency, transcribe audio recordings of an endangered language under timed conditions with and without the assistance of ASR. We find that both time-to-transcribe and transcription error rates are significantly reduced for language learners of all levels when they correct ASR output. Despite these improvements, most community members in our study express a preference for unassisted transcription, highlighting the need for developers to directly engage with stakeholders when designing and deploying technologies for supporting language documentation.
Pages: 491-513
Page count: 23