Input segmentation of spontaneous speech in JANUS: A speech-to-speech translation system

Cited by: 0
Authors
Lavie, A [1 ]
Gates, D [1 ]
Coccaro, N [1 ]
Levin, L [1 ]
Affiliations
[1] Carnegie Mellon Univ, Ctr Machine Translat, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In this paper we describe how multi-level segmentation of single utterance turns improves translation quality and facilitates accurate translation in our system. We define the basic dialogue units that are handled by our system, and discuss the cues and methods employed by the system in segmenting the input utterance into such units. Utterance segmentation in our system is performed in a multi-level incremental fashion, partly prior to and partly during analysis by the parser. The segmentation relies on a combination of acoustic, lexical, semantic and statistical knowledge sources, which are described in detail in the paper. We also discuss how our system is designed to disambiguate among alternative possible input segmentations.
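The abstract gives no implementation details, so the following is only a minimal illustrative sketch of how acoustic, lexical and statistical cues might be combined to place segment boundaries inside a single turn. It is not the JANUS method. Every name, weight, threshold and table entry here (Token, OPENING_WORDS, BOUNDARY_PROB, boundary_score, segment) is a hypothetical assumption chosen for illustration, and the semantic knowledge source mentioned in the abstract is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    word: str
    pause_after: float  # silence after the word, in seconds (assumed acoustic cue)


# Hypothetical lexical cue: words that often open a new dialogue unit.
OPENING_WORDS = {"okay", "so", "then"}

# Hypothetical statistical cue: estimated probability that a boundary
# falls between a given pair of adjacent words.
BOUNDARY_PROB: Dict[Tuple[str, str], float] = {
    ("good", "how"): 0.8,
}


def boundary_score(left: Token, right: Token) -> float:
    """Combine the three cues into one score (weights are assumptions)."""
    acoustic = 1.0 if left.pause_after >= 0.3 else 0.0
    lexical = 1.0 if right.word in OPENING_WORDS else 0.0
    statistical = BOUNDARY_PROB.get((left.word, right.word), 0.0)
    return 0.4 * acoustic + 0.2 * lexical + 0.4 * statistical


def segment(tokens: List[Token], threshold: float = 0.5) -> List[List[str]]:
    """Split a turn wherever the combined score reaches the threshold."""
    segments: List[List[str]] = []
    current: List[str] = []
    for i, tok in enumerate(tokens):
        current.append(tok.word)
        is_last = i == len(tokens) - 1
        if is_last or boundary_score(tok, tokens[i + 1]) >= threshold:
            segments.append(current)
            current = []
    return segments


turn = [
    Token("monday", 0.0), Token("is", 0.0), Token("good", 0.4),
    Token("how", 0.0), Token("about", 0.0), Token("you", 0.0),
]
print(segment(turn))
```

In this sketch a single cue is not enough to trigger a split under the assumed weights; the boundary after "good" is placed only because the pause and the word-pair statistic agree. A system like the one described would presumably also keep competing segmentations alive for the parser to disambiguate, which this sketch does not attempt.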
Pages: 86-99
Page count: 14