Real-Time Synchronization of Live Speech with Its Transcription

Cited by: 0
Authors
Lertwongkhanakool, Nat [1]
Punyabukkana, Proadpran [1]
Suchato, Atiwong [1]
Affiliations
[1] Chulalongkorn Univ, Fac Engn, Dept Comp Engn, Spoken Language Syst Res Grp, Bangkok, Thailand
Keywords
Automatic speech-text synchronization; Syllable detection; Real-time alignment; Live speech and transcription alignment; Endpoint detection
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
Most research on synchronizing audio with text has focused on synchronization at the utterance level. However, generating audio books from live speech in an unstructured language such as Thai requires a finer level of synchronization. We propose an algorithm that synchronizes live speech with its corresponding transcription in real time at the syllable level. The algorithm combines a syllable endpoint detection method and a syllable landmark detection method, using band-limited intensity as features. The experiment was conducted on the LOTUS datasets, and the results were compared with a baseline ASR-based syllable detection. We evaluated the algorithm by its error aberration, defined as the difference between the actual number of syllables and the number of detected syllables in each phrase. The average total error aberration was 11.54 for the proposed method and 34.21 for the baseline, so the proposed method outperforms the baseline on this measure. The reference deviation of our method was also better than that of the baseline.
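The error aberration measure described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the per-phrase difference is signed or absolute, nor how the "average total" is aggregated, so this sketch assumes an absolute difference averaged over phrases. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
from statistics import mean


def error_aberration(actual_count: int, detected_count: int) -> int:
    """Per-phrase aberration: gap between true and detected syllable counts.

    Assumption: the absolute value is used; the abstract only says "difference".
    """
    return abs(actual_count - detected_count)


def average_error_aberration(pairs):
    """Mean per-phrase aberration over (actual, detected) count pairs.

    Assumption: a plain arithmetic mean over phrases.
    """
    return mean(error_aberration(a, d) for a, d in pairs)


if __name__ == "__main__":
    # Made-up syllable counts for illustration only, not data from the paper.
    phrases = [(5, 5), (7, 6), (4, 6)]
    print(average_error_aberration(phrases))
```

A lower value means the detector's syllable count stays closer to the transcription, which is the sense in which the abstract reports the proposed method (11.54) as better than the baseline (34.21).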
Pages: 5