Statistical Vowelization of Arabic Text for Speech Synthesis in Speech-to-Speech Translation Systems

Cited by: 0
Authors
Gu, Liang [1]
Zhang, Wei [1]
Tahir, Lazkin [1]
Gao, Yuqing [1]
Affiliations
[1] IBM Corp, Div Res, TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Vowelization presents a principal difficulty in building text-to-speech synthesizers for speech-to-speech translation systems. In this paper, a novel log-linear modeling method is proposed that takes into account vowel and diacritical information at both the word level and the character level. A unique syllable-based normalization algorithm is then introduced to enhance both word coverage and data consistency. A recursive data generation and model training scheme is further devised to jointly optimize speech synthesizers and vowelizers for an English-Arabic speech translation system. The diacritization error rate is reduced by over 50% in vowelization experiments.
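The abstract reports its result as a diacritization error rate but does not define the metric. The following is a minimal sketch of one common character-level definition, offered as an assumption rather than the authors' exact procedure: the share of base characters whose attached diacritics differ between a reference and a hypothesis. The function names and the choice to count every base character (including spaces) as a position are hypothetical.

```python
# Sketch only: a character-level diacritization error rate (DER).
# Assumption: DER = positions with wrong diacritics / all base characters.
# The paper's own definition may differ.

# Arabic diacritic code points U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def split_diacritics(text):
    """Return (base_char, diacritics) pairs for a vowelized string."""
    pairs = []
    for ch in text:
        if ch in DIACRITICS and pairs:
            # Attach the mark to the most recent base character.
            base, marks = pairs[-1]
            pairs[-1] = (base, marks + ch)
        else:
            # Base character (a stray leading mark is treated as a base).
            pairs.append((ch, ""))
    return pairs


def diacritization_error_rate(reference, hypothesis):
    """Fraction of base characters whose diacritics differ."""
    ref = split_diacritics(reference)
    hyp = split_diacritics(hypothesis)
    if [b for b, _ in ref] != [b for b, _ in hyp]:
        raise ValueError("reference and hypothesis base characters differ")
    if not ref:
        return 0.0
    # Compare marks order-insensitively (e.g. shadda+fatha vs fatha+shadda).
    errors = sum(1 for (_, r), (_, h) in zip(ref, hyp) if sorted(r) != sorted(h))
    return errors / len(ref)


# Usage: three consonants, one of which carries kasra instead of fatha.
reference = "\u0643\u064e\u062a\u064e\u0628\u064e"
hypothesis = "\u0643\u064e\u062a\u0650\u0628\u064e"
print(diacritization_error_rate(reference, hypothesis))
```

Under this definition, a relative reduction of over 50% means the error rate of the improved system is less than half that of the baseline.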
Pages: 509-512
Page count: 4