Statistical Machine Translation for Speech: A Perspective on Structures, Learning, and Decoding

Author
Zhou, Bowen [1]
Affiliation
[1] IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
Discriminative training; finite-state transducer (FST); graph; hypergraph; speech translation (ST); statistical machine translation (SMT); synchronous context-free grammar (SCFG); Viterbi search; recognition
DOI
10.1109/JPROC.2013.2249491
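Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract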
In this paper, we survey and analyze state-of-the-art statistical machine translation (SMT) techniques for speech translation (ST). We review key learning problems, and investigate essential model structures in SMT, taking a unified perspective to reveal both connections and contrasts between automatic speech recognition (ASR) and SMT. We show that phrase-based SMT can be viewed as a sequence of finite-state transducer (FST) operations, similar in spirit to ASR. We further inspect the synchronous context-free grammar (SCFG)-based formalism that includes hierarchical phrase-based and many linguistically syntax-based models. Decoding for ASR, FST-based, and SCFG-based translation is also presented from a unified perspective as different realizations of the generic Viterbi algorithm on graphs or hypergraphs. These consolidated perspectives are helpful to catalyze tighter integrations for improved ST, and we discuss joint decoding and modeling toward coupling ASR and SMT.
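The abstract's claim that graph and hypergraph decoding share one Viterbi recursion can be illustrated with a minimal sketch. It is not taken from the paper: the `Hyperedge` type, the `viterbi` function, and the toy node names and log-score weights are all hypothetical, and the hypergraph is assumed to be acyclic with nodes supplied in topological order. An ordinary graph (as in ASR or FST-based decoding) is treated as the special case in which every hyperedge has exactly one tail node.

```python
from collections import namedtuple

# Hypothetical hyperedge: one head node, a tuple of tail nodes, a log-score weight.
# A plain graph edge is a hyperedge with a single tail.
Hyperedge = namedtuple("Hyperedge", ["head", "tails", "weight"])


def viterbi(nodes_in_topological_order, edges, axioms):
    """Return the best (max-sum) score and back-pointer edge for each reachable node.

    Assumes an acyclic hypergraph whose nodes are listed so that every tail
    node appears before the head nodes it feeds.
    """
    incoming = {}
    for edge in edges:
        incoming.setdefault(edge.head, []).append(edge)

    best = {node: 0.0 for node in axioms}  # axioms (source nodes) score 0
    back = {}
    for node in nodes_in_topological_order:
        for edge in incoming.get(node, []):
            if any(tail not in best for tail in edge.tails):
                continue  # some antecedent is unreachable
            score = edge.weight + sum(best[tail] for tail in edge.tails)
            if node not in best or score > best[node]:
                best[node] = score
                back[node] = edge
    return best, back


# Graph case: a small lattice with two paths from "s" to "t".
lattice_edges = [
    Hyperedge("a", ("s",), -1.0),
    Hyperedge("b", ("s",), -2.0),
    Hyperedge("t", ("a",), -3.0),
    Hyperedge("t", ("b",), -1.0),
]
lattice_best, lattice_back = viterbi(["s", "a", "b", "t"], lattice_edges, {"s"})

# Hypergraph case: node "z" can be built from both "x" and "y", or from "x" alone.
forest_edges = [
    Hyperedge("z", ("x", "y"), -0.5),
    Hyperedge("z", ("x",), -2.0),
]
forest_best, forest_back = viterbi(["x", "y", "z"], forest_edges, {"x", "y"})
```

The same loop handles both inputs; only the number of tails per edge differs, which is the sense in which the two decoding problems can be seen as realizations of one algorithm. Practical decoders add pruning and language-model state, which this sketch omits.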
Pages: 1180-1202
Number of pages: 23