High-quality bilingual subtitle document alignments with application to spontaneous speech translation

Cited by: 4
Authors
Tsiartas, Andreas [1]
Ghosh, Prasanta [1]
Georgiou, Panayiotis [1]
Narayanan, Shrikanth [1]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Signal Anal & Interpretat Lab, Los Angeles, CA 90089 USA
Source
COMPUTER SPEECH AND LANGUAGE | 2013 / Volume 27 / Issue 02
Funding
National Science Foundation (USA);
Keywords
Movie subtitle alignment; Spontaneous speech translation;
DOI
10.1016/j.csl.2011.10.002
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the task of translating spontaneous speech transcriptions by using aligned movie subtitles to train a statistical machine translation (SMT) system. In contrast to lexical dynamic time warping (DTW) approaches to bilingual subtitle alignment, we align subtitle documents using time-stamps. We show that subtitle time-stamps in two languages are often approximately linearly related, which can be exploited to extract high-quality bilingual subtitle pairs. On a small tagged dataset, we achieve a performance improvement of 0.21 F-score points over the traditional DTW alignment approach and 0.39 F-score points over a simple line-fitting approach. In addition, in spontaneous speech translation experiments, the subtitle data aligned by the proposed approach yields a gain of 4.88 BLEU score points over the data aligned by the DTW-based approach, demonstrating the merit of the time-stamp-based subtitle alignment scheme. (C) 2011 Elsevier Ltd. All rights reserved.
Pages: 572-591
Page count: 20
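
The abstract's observation that subtitle time-stamps in two languages are often approximately linearly related can be illustrated with a minimal sketch. This is not the authors' method: the abstract reports that the proposed approach outperforms simple line fitting, and the code below is only that simple baseline idea. The function names `fit_line` and `align_by_time`, the use of start times in seconds, and the tolerance `tol` are all assumptions made for illustration.

```python
from bisect import bisect_left


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~= slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def align_by_time(src_starts, tgt_starts, slope, intercept, tol=1.0):
    """Pair each source line with the nearest target line in mapped time.

    src_starts and tgt_starts are subtitle start times in seconds;
    tgt_starts must be sorted in ascending order. A pair is kept only
    if the mapped source time lies within tol seconds of the target
    start time (tol is a hypothetical threshold, not from the paper).
    """
    pairs = []
    for i, t in enumerate(src_starts):
        mapped = slope * t + intercept
        j = bisect_left(tgt_starts, mapped)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(tgt_starts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(tgt_starts[k] - mapped))
        if abs(tgt_starts[best] - mapped) <= tol:
            pairs.append((i, best))
    return pairs
```

In use, one would estimate `slope` and `intercept` from a few trusted anchor pairs with `fit_line`, then call `align_by_time` on the full lists of start times. Target lines with no nearby mapped source time are left unpaired, which is one simple way to favour precision over coverage.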