Input segmentation of spontaneous speech in JANUS: A speech-to-speech translation system

Cited by: 0

Authors:

Lavie, A [1]

Gates, D [1]

Coccaro, N [1]

Levin, L [1]

Affiliations:

[1] Carnegie Mellon Univ, Ctr Machine Translat, Pittsburgh, PA 15213 USA

Source:

DIALOGUE PROCESSING IN SPOKEN LANGUAGE SYSTEMS | 1997, Vol. 1236

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In this paper we describe how multi-level segmentation of single utterance turns improves translation quality and facilitates accurate translation in our system. We define the basic dialogue units that are handled by our system, and discuss the cues and methods employed by the system in segmenting the input utterance into such units. Utterance segmentation in our system is performed in a multi-level incremental fashion, partly prior to and partly during analysis by the parser. The segmentation relies on a combination of acoustic, lexical, semantic and statistical knowledge sources, which are described in detail in the paper. We also discuss how our system is designed to disambiguate among alternative possible input segmentations.

Pages: 86-99

Page count: 14
