Incremental TTS for Japanese Language

Cited by: 4
Authors
Yanagita, Tomoya [1]
Sakti, Sakriani [1, 2]
Nakamura, Satoshi [1, 2]
Affiliations
[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Ikoma, Japan
[2] RIKEN, Ctr Adv Intelligence Project AIP, Wako, Saitama, Japan
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
Incremental speech synthesis; linguistic and temporal locality features; HMM based speech synthesis
DOI
10.21437/Interspeech.2018-1561
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Simultaneous lecture translation requires speech to be translated in real time, before the speaker has finished an entire sentence, since a long delay makes it difficult for listeners to follow the lecture. The challenge is to construct a full-fledged system with speech recognition, machine translation, and text-to-speech synthesis (TTS) components that can produce high-quality speech translations on the fly. For TTS specifically, this poses a problem, because a conventional framework commonly requires the language-dependent contextual linguistic information of a full sentence to produce a natural-sounding speech waveform. Several studies have proposed approaches to incremental TTS (ITTS), which estimates the target prosody from only partial knowledge of the sentence. However, most investigations have been conducted only for French, English, and German. French is a syllable-timed language and the others are stress-timed languages. Japanese, which is a mora-timed language, has not been investigated so far. In this paper, we evaluate the quality of Japanese synthesized speech based on various linguistic and temporal incremental units. Experimental results reveal that an accent phrase incremental unit (a group of moras) is essential for a Japanese ITTS as a trade-off between quality and synthesis units.
Pages: 902-906
Page count: 5