Segmentation and Disfluency Removal for Conversational Speech Translation

Authors
Hassan, Hany [1]
Schwartz, Lee [1]
Hakkani-Tur, Dilek [1]
Tur, Gokhan [1]
Affiliations
[1] Microsoft Res, Cambridge, England
Keywords
speech translation; disfluency removal; segmentation; sentence units; speech processing
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we focus on the effect of on-line speech segmentation and disfluency removal methods on conversational speech translation. In a real-time conversational speech-to-speech translation system, on-line segmentation of speech is required to avoid latency beyond a few seconds. While sentential unit segmentation and disfluency removal have been heavily studied, mainly for off-line speech processing, to the best of our knowledge the combined effect of these tasks on conversational speech translation has not been investigated. Furthermore, optimizing performance under the maximum system latency that still allows a conversation is a newer problem for these tasks. We show that the conventional practice of doing segmentation followed by disfluency removal is not the best one. We propose a new approach: simple-disfluency removal, followed by segmentation, and then complex-disfluency removal. The proposed approach shows a significant gain in translation performance of up to 3 BLEU points with only 6 seconds of look-ahead latency, using state-of-the-art machine translation and speech recognition systems.
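The ordering proposed in the abstract (simple-disfluency removal, then segmentation, then complex-disfluency removal) can be pictured with a toy sketch. Everything in it is an illustrative assumption and not the authors' method: the filler inventory, the `<pause>` token, the `max_tokens` cap standing in for a latency budget, and the "i mean" restart rule are all hypothetical placeholders for the models a real system would use.

```python
from typing import List

FILLERS = {"uh", "um", "er"}  # assumed filler inventory, illustration only
PAUSE = "<pause>"             # hypothetical pause marker from the recognizer


def remove_simple_disfluencies(tokens: List[str]) -> List[str]:
    """Drop filled pauses and immediate word repetitions."""
    out: List[str] = []
    for tok in tokens:
        if tok in FILLERS:
            continue
        if out and out[-1] == tok:
            continue  # immediate repetition such as "i i"
        out.append(tok)
    return out


def segment(tokens: List[str], max_tokens: int = 12) -> List[List[str]]:
    """Cut at pause markers, or when the segment reaches max_tokens
    (a crude stand-in for a look-ahead latency limit)."""
    segments: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok == PAUSE:
            if current:
                segments.append(current)
                current = []
            continue
        current.append(tok)
        if len(current) >= max_tokens:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def remove_complex_disfluencies(seg: List[str]) -> List[str]:
    """Toy restart rule: keep only what follows the first 'i mean'."""
    for i in range(len(seg) - 1):
        if seg[i] == "i" and seg[i + 1] == "mean":
            return seg[i + 2:]
    return seg


def pipeline(tokens: List[str], max_tokens: int = 12) -> List[List[str]]:
    """Simple removal, then segmentation, then complex removal per segment."""
    cleaned = remove_simple_disfluencies(tokens)
    result: List[List[str]] = []
    for seg in segment(cleaned, max_tokens):
        seg = remove_complex_disfluencies(seg)
        if seg:
            result.append(seg)
    return result


if __name__ == "__main__":
    text = "um i i want to uh go i mean i need to leave <pause> see you you tomorrow"
    for seg in pipeline(text.split()):
        print(" ".join(seg))
```

In this sketch the segmenter sees text that has already been stripped of fillers and repetitions, so they do not count against the `max_tokens` cap, while the restart rule runs only inside each finished segment.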
Pages: 318-322
Number of pages: 5
Related papers
50 in total
  • [1] Language and disfluency in nonstuttering children's conversational speech
    Yaruss, JS
    Newman, RM
    Flora, T
    [J]. JOURNAL OF FLUENCY DISORDERS, 1999, 24 (03) : 185 - 207
  • [2] Interactive translation of conversational speech
    Waibel, A
    [J]. COMPUTER, 1996, 29 (07) : 41 - &
  • [3] Automatic linguistic segmentation of conversational speech
    Stolcke, A
    Shriberg, E
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 1005 - 1008
  • [4] Concept Segmentation and Labeling for Conversational Speech
    Dinarelli, Marco
    Moschitti, Alessandro
    Riccardi, Giuseppe
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2691 - 2694
  • [5] AUTOMATIC DISFLUENCY REMOVAL FOR IMPROVING SPOKEN LANGUAGE TRANSLATION
    Wang, Wen
    Tur, Gokhan
    Zheng, Jing
    Ayan, Necip Fazil
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5214 - 5217
  • [6] End-to-End Speech Recognition and Disfluency Removal
    Lou, Paria Jamshid
    Johnson, Mark
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020, : 2051 - 2061
  • [7] Speech Disfluency in School-Age Children's Conversational and Narrative Discourse
    Byrd, Courtney T.
    Logan, Kenneth J.
    Gillam, Ronald B.
    [J]. LANGUAGE SPEECH AND HEARING SERVICES IN SCHOOLS, 2012, 43 (02) : 153 - 163
  • [8] Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech
    Hough, Julian
    Schlangen, David
    [J]. 15TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2017), VOL 1: LONG PAPERS, 2017, : 326 - 336
  • [9] Dialogue processing in a conversational speech translation system
    Lavie, A
    Levin, L
    Qu, Y
    Waibel, A
    Gates, D
    Gavalda, M
    Mayfield, L
    Taboada, M
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 554 - 557
  • [10] Translation of conversational speech with JANUS-II
    Lavie, A
    Waibel, A
    Levin, L
    Gates, D
    Gavalda, M
    Zeppenfeld, T
    Zhan, PM
    Glickman, O
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2375 - 2378