High-quality bilingual subtitle document alignments with application to spontaneous speech translation

Cited by: 4
Authors
Tsiartas, Andreas [1]
Ghosh, Prasanta [1]
Georgiou, Panayiotis [1]
Narayanan, Shrikanth [1]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Signal Anal & Interpretat Lab, Los Angeles, CA 90089 USA
Source
COMPUTER SPEECH AND LANGUAGE | 2013 / Vol. 27 / Issue 02
Funding
National Science Foundation (USA);
Keywords
Movie subtitle alignment; Spontaneous speech translation;
DOI
10.1016/j.csl.2011.10.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the task of translating spontaneous speech transcriptions by employing aligned movie subtitles in training a statistical machine translator (SMT). In contrast to the lexical-based dynamic time warping (DTW) approaches to bilingual subtitle alignment, we align subtitle documents using time-stamps. We show that subtitle time-stamps in two languages are often approximately linearly related, which can be exploited for extracting high-quality bilingual subtitle pairs. On a small tagged data-set, we achieve a performance improvement of 0.21 F-score points compared to the traditional DTW alignment approach and 0.39 F-score points compared to a simple line-fitting approach. In addition, we achieve a performance gain of 4.88 BLEU score points in spontaneous speech translation experiments using the aligned subtitle data obtained by the proposed alignment approach compared to that obtained by the DTW-based alignment approach, demonstrating the merit of the time-stamp-based subtitle alignment scheme. (C) 2011 Elsevier Ltd. All rights reserved.
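The observation that subtitle time-stamps in two languages are often approximately linearly related can be illustrated with a small sketch. This is not the authors' algorithm; it is closer in spirit to a plain line fit followed by nearest-neighbour pairing. The function names (`fit_time_map`, `align_by_timestamps`), the anchor pairs, the tolerance `tol`, and all time values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_time_map(src_times, tgt_times):
    """Least-squares fit of tgt ≈ a * src + b from anchor pairs (assumed known)."""
    a, b = np.polyfit(src_times, tgt_times, deg=1)
    return a, b

def align_by_timestamps(src_starts, tgt_starts, a, b, tol=0.5):
    """Pair each source subtitle with the nearest target subtitle
    after mapping its start time through the fitted line.
    Pairs farther apart than `tol` seconds are dropped."""
    tgt = np.asarray(tgt_starts, dtype=float)
    pairs = []
    for i, s in enumerate(src_starts):
        predicted = a * s + b
        j = int(np.argmin(np.abs(tgt - predicted)))
        if abs(tgt[j] - predicted) <= tol:
            pairs.append((i, j))
    return pairs

# Hypothetical start times in seconds. The target track runs at a slightly
# different rate with an offset, and contains one extra subtitle (27.0 s)
# that has no counterpart in the source track.
src_starts = [10.0, 20.0, 30.0, 40.0]
tgt_starts = [12.4, 22.8, 27.0, 33.2, 43.6]

# Two anchor pairs assumed to be known (first and last subtitles).
a, b = fit_time_map([10.0, 40.0], [12.4, 43.6])
pairs = align_by_timestamps(src_starts, tgt_starts, a, b)
print(a, b, pairs)
```

In this toy setting the extra target subtitle is simply left unpaired, because no mapped source time falls within the tolerance of it. How anchors are chosen and how noisy or non-linear files are handled is left open here.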
Pages: 572-591
Page count: 20