High-quality bilingual subtitle document alignments with application to spontaneous speech translation

Cited by: 4
Authors
Tsiartas, Andreas [1 ]
Ghosh, Prasanta [1 ]
Georgiou, Panayiotis [1 ]
Narayanan, Shrikanth [1 ]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Signal Anal & Interpretat Lab, Los Angeles, CA 90089 USA
Source
COMPUTER SPEECH AND LANGUAGE | 2013 / Vol. 27 / Issue 02
Funding
National Science Foundation (USA);
Keywords
Movie subtitle alignment; Spontaneous speech translation;
DOI
10.1016/j.csl.2011.10.002
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the task of translating spontaneous speech transcriptions by employing aligned movie subtitles in training a statistical machine translator (SMT). In contrast to the lexical-based dynamic time warping (DTW) approaches to bilingual subtitle alignment, we align subtitle documents using time-stamps. We show that subtitle time-stamps in two languages are often approximately linearly related, which can be exploited for extracting high-quality bilingual subtitle pairs. On a small tagged data set, we achieve a performance improvement of 0.21 F-score points compared to a traditional DTW alignment approach and 0.39 F-score points compared to a simple line-fitting approach. In addition, in spontaneous speech translation experiments, the aligned subtitle data obtained by the proposed alignment approach yields a gain of 4.88 BLEU points over the data obtained by the DTW-based alignment approach, demonstrating the merit of the time-stamp-based subtitle alignment scheme. (C) 2011 Elsevier Ltd. All rights reserved.
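The abstract's central observation, that time-stamps in the two subtitle tracks are often approximately linearly related, can be illustrated with a small sketch. This is not the authors' algorithm; it is closer in spirit to the simple line-fitting baseline the abstract mentions. The function names (`fit_line`, `pair_by_time`), the anchor pairs, and the 0.5-second tolerance are all assumptions made for illustration.

```python
from bisect import bisect_left


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ a * xs + b (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def pair_by_time(src_starts, tgt_starts, a, b, tol=0.5):
    """Pair source and target subtitles by start time.

    Each source start time is mapped through t -> a * t + b and matched to
    the nearest target start time; the pair is kept only if the two times
    differ by at most `tol` seconds (an assumed threshold).
    `tgt_starts` must be sorted in ascending order.
    """
    pairs = []
    for i, t in enumerate(src_starts):
        mapped = a * t + b
        k = bisect_left(tgt_starts, mapped)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = [j for j in (k - 1, k) if 0 <= j < len(tgt_starts)]
        if not candidates:
            continue
        j = min(candidates, key=lambda c: abs(tgt_starts[c] - mapped))
        if abs(tgt_starts[j] - mapped) <= tol:
            pairs.append((i, j))
    return pairs


# Assumed anchor pairs (seconds) from which the linear relation is estimated.
a, b = fit_line([0.0, 10.0, 20.0], [2.0, 12.5, 23.0])

# Made-up start times for the two subtitle tracks.
src_starts = [0.0, 10.0, 20.0, 30.0]
tgt_starts = [2.0, 12.4, 23.1, 40.0]
pairs = pair_by_time(src_starts, tgt_starts, a, b)
```

In this toy data the last source subtitle has no target start time within the tolerance, so it is left unpaired; discarding uncertain matches in this way trades recall for precision.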
Pages: 572 - 591
Number of pages: 20