Statistical Vowelization of Arabic Text for Speech Synthesis in Speech-to-Speech Translation Systems

被引:0
|
作者
Gu, Liang [1 ]
Zhang, Wei [1 ]
Tahir, Lazkin [1 ]
Gao, Yuqing [1 ]
机构
[1] IBM Corp, Div Res, TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Vowelization presents a principle difficulty in building text-to-speech synthesizers for speech-to-speech translation systems. In this paper, a novel log-linear modeling method is proposed that takes into account vowel and diacritical information at both the word level and character level. A unique syllable based normalization algorithm is then introduced to enhance both word coverage and data consistency. A recursive data generation and model training scheme is further devised to jointly optimize speech synthesizers and vowelizers for an English-Arabic speech translation system. The diacritization error rate is reduced by over 50% in vowelization experiments.
引用
收藏
页码:509 / 512
页数:4
相关论文
共 50 条
  • [21] Modern Arabic speech corpus for Text to Speech synthesis
    Oumaima, Zine
    Meziane, Abdelouafi
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON TECHNOLOGY MANAGEMENT, OPERATIONS AND DECISIONS (ICTMOD), 2020,
  • [22] The impact of ASR on speech-to-speech translation performance
    Sarikaya, Ruhi
    Zhou, Bowen
    Povey, Daniel
    Afify, Mohamed
    Gao, Yuqing
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1289 - +
  • [23] The ATR multilingual speech-to-speech translation system
    Nakamura, S
    Markov, K
    Nakaiwa, H
    Kikui, G
    Kawai, H
    Jitsuhiro, T
    Zhang, JS
    Yamamoto, H
    Sumita, E
    Yamamoto, S
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (02): : 365 - 376
  • [24] Semantic transfer in speech-to-speech machine translation
    Abb, B
    Buschbeck-Wolf, B
    Tschernitschek, C
    [J]. NATURAL LANGUAGE PROCESSING AND SPEECH TECHNOLOGY: RESULTS OF THE 3RD KONVENS CONFERENCE, 1996, : 123 - 136
  • [25] Speech-to-speech Low-resource Translation
    Liu, Hsiao-Chuan
    Day, Min-Yuh
    Wang, Chih-Chien
    [J]. 2023 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI, 2023, : 91 - 95
  • [26] Finite-state speech-to-speech translation
    Vidal, E
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 111 - 114
  • [27] ASSESSING EVALUATION METRICS FOR SPEECH-TO-SPEECH TRANSLATION
    Salesky, Elizabeth
    Maeder, Julian
    Klinger, Severin
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 733 - 740
  • [28] Incremental Dialog Clustering For Speech-to-Speech Translation
    Stallard, David
    Tsakalidis, Stavros
    Saleem, Shirin
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 428 - 431
  • [29] A speech-to-speech translation based interface for tourism
    Cettolo, M
    Corazza, A
    Lazzari, G
    Pianesi, F
    Pianta, E
    Tovena, LM
    [J]. INFORMATION AND COMMUNICATION TECHNOLOGIES IN TOURISM 1999, 1999, : 191 - 200
  • [30] INTENT TRANSFER IN SPEECH-TO-SPEECH MACHINE TRANSLATION
    Anumanchipalli, Gopala Krishna
    Oliveira, Luis C.
    Black, Alan W.
    [J]. 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 153 - 158