Multilingual Web Conferencing Using Speech-to-Speech Translation

被引:0
|
作者
Chen, John [1 ]
Wen, Shufei [1 ]
Sridhar, Vivek Kumar Rangarajan [1 ]
Bangalore, Srinivas [1 ]
机构
[1] AT&T Labs Res, 180 Pk Ave, Florham Pk, NJ 07932 USA
关键词
Web conferencing; simultaneous speech-to-speech translation; session initiation protocol (SIP);
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
It is now commonplace to use web conferencing technology in order to hold meetings between participants situated in different physical locations. A drawback of this technology is that nearly all of the interaction between these participants is monolingual. Here, we demonstrate a novel form of this technology that enables cross-lingual speech-to-speech communication between conference participants in real time. We model this translation problem as a combination of incremental speech recognition and segmentation, addressing the question of finding which segmentation strategy maximizes translation accuracy while minimizing latency. Our demonstration takes the form of a web conferencing scenario where a presenter speaks in one language while talk participants listen to or read the speaker's translated texts in real time. This system is flexible enough to allow real-time translation of technical talks or speeches covering broad topics.
引用
收藏
页码:1860 / 1862
页数:3
相关论文
共 50 条
  • [1] Multilingual speech-to-speech translation system: VoiceTra
    Matsuda, Shigeki
    Hu, Xinhui
    Shiga, Yoshinori
    Kashioka, Hideki
    Hori, Chiori
    Yasuda, Keiji
    Okuma, Hideo
    Uchiyama, Masao
    Sumita, Eiichiro
    Kawai, Hisashi
    Nakamura, Satoshi
    [J]. 2013 IEEE 14TH INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2013), VOL 2, 2013, : 229 - 233
  • [2] The ATR multilingual speech-to-speech translation system
    Nakamura, S
    Markov, K
    Nakaiwa, H
    Kikui, G
    Kawai, H
    Jitsuhiro, T
    Zhang, JS
    Yamamoto, H
    Sumita, E
    Yamamoto, S
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (02): : 365 - 376
  • [3] CVSS Corpus and Massively Multilingual Speech-to-Speech Translation
    Jia, Ye
    Ramanovich, Michelle Tadmor
    Wang, Quan
    Zen, Heiga
    [J]. LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6691 - 6703
  • [4] Rhonda: the architecture of a multilingual speech-to-speech translation pipeline
    Louw, Johannes A.
    Moodley, Avashlin
    [J]. 2018 INTERNATIONAL CONFERENCE ON INTELLIGENT AND INNOVATIVE COMPUTING APPLICATIONS (ICONIC), 2018, : 194 - 200
  • [5] Multilingual Speech-to-Speech Translation System for Mobile Consumer Devices
    Yun, Seung
    Lee, Young-Jik
    Kim, Sang-Hun
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2014, 60 (03) : 508 - 516
  • [6] Generating Arabic text in multilingual speech-to-speech machine translation framework
    Monem, Azza Abdel
    Shaalan, Khaled
    Rafea, Ahmed
    Baraka, Hoda
    [J]. MACHINE TRANSLATION, 2008, 22 (04) : 205 - 258
  • [7] Multilingual generation for translation in speech-to-speech dialogues and its realization in verbmobil
    Becker, T
    Kilger, A
    Lopez, P
    Poller, P
    [J]. ECAI 2000: 14TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2000, 54 : 401 - 405
  • [8] Developing high performance ASR in the IBM multilingual speech-to-speech translation system
    Cui, Xiaodong
    Gu, Liang
    Xiang, Bing
    Zhang, Wei
    Gao, Yuqing
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 5121 - 5124
  • [9] AwezaMed: A Multilingual, Multimodal Speech-To-Speech Translation Application for Maternal Health Care
    Marais, Laurette
    Louw, Johannes A.
    Badenhorst, Jaco
    Calteaux, Karen
    Wilken, Ilana
    van Niekerk, Nina
    Stein, Glenn
    [J]. PROCEEDINGS OF 2020 23RD INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION 2020), 2020, : 669 - 676
  • [10] Impacts of machine translation and speech synthesis on speech-to-speech translation
    Hashimoto, Kei
    Yamagishi, Junichi
    Byrne, William
    King, Simon
    Tokuda, Keiichi
    [J]. SPEECH COMMUNICATION, 2012, 54 (07) : 857 - 866