Improving speech transcription for Mandarin-English translation

Cited by: 0
Authors
Tomalin, M. [1]
Gales, M. J. F. [1]
Liu, X. A. [1]
Sim, K. C. [1]
Sinha, R. [1]
Wang, L. [1]
Woodland, P. C. [1]
Yu, K. [1]
Affiliations
[1] Univ Cambridge, Dept Engn, Trumpington St, Cambridge CB2 1PZ, England
Keywords
speech recognition; sentence boundary detection; machine translation
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes the development of the CU-HTK Mandarin Speech-To-Text (STT) system and assesses its performance as part of a transcription-translation pipeline which converts broadcast Mandarin audio into English text. Recent improvements to the STT system are described and these give Character Error Rate (CER) gains of 14.3% absolute for a Broadcast Conversation (BC) task and 5.1% absolute for a Broadcast News (BN) task. The output of these STT systems is then post-processed, so that it consists of sentence-like segments, and translated into English text using a Statistical Machine Translation (SMT) system. The performance of the transcription-translation pipeline is evaluated using the Translation Edit Rate (TER) and BLEU metrics. It is shown that improving both the STT system and the post-STT segmentations can lower the TER scores by up to 5.3% absolute and increase the BLEU scores by up to 2.7% absolute.
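The abstract reports its STT improvements as absolute reductions in Character Error Rate. As a rough illustration only, the sketch below computes CER as character-level edit distance divided by reference length, and an absolute gain as the difference between two CER percentages. The function names, the example sentences, and the CER values 20.0 and 15.0 are hypothetical and are not taken from the paper; the paper's own scoring tools and text normalisation are not described in this record.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference character
                            curr[j - 1] + 1,      # insertion of a hypothesis character
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """CER in percent: edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def absolute_gain(baseline_cer, improved_cer):
    """Absolute gain: plain difference between two CER percentages."""
    return baseline_cer - improved_cer


# Hypothetical example: one substituted and one deleted character.
reference = "今天天气很好"
hypothesis = "今天天汽好"
print(round(character_error_rate(reference, hypothesis), 2))

# Hypothetical CER values, chosen only to show what "absolute" means.
print(absolute_gain(20.0, 15.0))
```

Under this definition, a drop from 20.0% to 15.0% CER is a 5.0% absolute gain, whereas the same change expressed as a relative gain would be 25%.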
Pages: 97 / +
Page count: 2