Segmentation and Disfluency Removal for Conversational Speech Translation

Cited by: 0
Authors
Hassan, Hany [1]
Schwartz, Lee [1]
Hakkani-Tur, Dilek [1]
Tur, Gokhan [1]
Affiliations
[1] Microsoft Res, Cambridge, England
Keywords
speech translation; disfluency removal; segmentation; sentence units; speech processing;
DOI
Not available
CLC classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we focus on the effect of on-line speech segmentation and disfluency removal methods on conversational speech translation. In a real-time conversational speech-to-speech translation system, on-line segmentation of speech is required to avoid latency beyond a few seconds. While sentential unit segmentation and disfluency removal have been heavily studied, mainly for off-line speech processing, to the best of our knowledge the combined effect of these tasks on conversational speech translation has not been investigated. Furthermore, optimizing performance under a maximum allowable system latency that still enables a conversation is a newer problem for these tasks. We show that the conventional order of segmentation followed by disfluency removal is not the best practice. We propose a new approach that performs simple-disfluency removal, followed by segmentation, and then complex-disfluency removal. The proposed approach shows a significant gain in translation performance of up to 3 BLEU points with only 6 seconds of look-ahead latency, using state-of-the-art machine translation and speech recognition systems.
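The ordering proposed in the abstract (simple-disfluency removal, then segmentation, then complex-disfluency removal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the models used, so every rule here is a toy assumption. The filler list `FILLERS`, the pause-based segmenter with its `pause_threshold` and `max_len` parameters, and the repeated-phrase rule standing in for "complex" disfluencies are all hypothetical, as are the function names.

```python
from typing import List, Tuple

# A token is (word, pause in seconds after the word). Hypothetical input format.
Token = Tuple[str, float]

# Assumed filler inventory; a real system would learn or configure this.
FILLERS = {"uh", "um", "er"}


def remove_simple_disfluencies(tokens: List[Token]) -> List[Token]:
    """Step 1: drop fillers and immediate single-word repetitions."""
    kept: List[Token] = []
    for word, pause in tokens:
        is_filler = word.lower() in FILLERS
        is_repeat = bool(kept) and kept[-1][0].lower() == word.lower()
        if is_filler or is_repeat:
            # Keep the longer pause on the previous word so that
            # segmentation cues are not lost with the deleted token.
            if kept:
                prev_word, prev_pause = kept[-1]
                kept[-1] = (prev_word, max(prev_pause, pause))
            continue
        kept.append((word, pause))
    return kept


def segment(tokens: List[Token], pause_threshold: float = 0.5,
            max_len: int = 12) -> List[List[str]]:
    """Step 2: toy segmenter. Split at long pauses, or when a segment
    reaches max_len words (a crude stand-in for a latency budget)."""
    segments: List[List[str]] = []
    current: List[str] = []
    for word, pause in tokens:
        current.append(word)
        if pause >= pause_threshold or len(current) >= max_len:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def remove_complex_disfluencies(words: List[str], max_n: int = 4) -> List[str]:
    """Step 3: toy rule for restarts. Delete the first copy of any
    immediately repeated phrase of 2..max_n words within a segment."""
    words = list(words)
    for n in range(max_n, 1, -1):
        i = 0
        while i + 2 * n <= len(words):
            first = [w.lower() for w in words[i:i + n]]
            second = [w.lower() for w in words[i + n:i + 2 * n]]
            if first == second:
                del words[i:i + n]
            else:
                i += 1
    return words


def clean_for_translation(tokens: List[Token]) -> List[List[str]]:
    """Run the three steps in the order described in the abstract."""
    simple_cleaned = remove_simple_disfluencies(tokens)
    return [remove_complex_disfluencies(seg) for seg in segment(simple_cleaned)]
```

In this sketch, removing fillers before segmentation keeps them from fragmenting segments, while the phrase-level rule runs afterwards so that it only looks for restarts inside a single segment. Whether these are the reasons behind the reported gain is not stated in the abstract.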
Pages: 318-322
Page count: 5