Real-Time Synchronization of Live Speech with Its Transcription

Cited by: 0
Authors
Lertwongkhanakool, Nat [1 ]
Punyabukkana, Proadpran [1 ]
Suchato, Atiwong [1 ]
Affiliation
[1] Chulalongkorn Univ, Fac Engn, Dept Comp Engn, Spoken Language Syst Res Grp, Bangkok, Thailand
Keywords
Automatic speech-text synchronization; Syllable detection; Real-time alignment; Live speech and transcription alignment; Endpoint detection
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Most research on synchronizing audio with text has focused on synchronization at the utterance level. However, generating audio books from live speech in an unstructured language such as Thai requires a finer level of synchronization. We propose an algorithm that synchronizes live speech with its corresponding transcription in real time at the syllable level. The proposed algorithm combines a syllable endpoint detection method and a syllable landmark detection method, using band-limited intensity as the feature. Experiments were conducted on the LOTUS datasets, and the results were compared with a baseline ASR-based syllable detector. We evaluated the algorithm by its error aberration, defined as the difference between the actual number of syllables and the number of detected syllables in each phrase. The average total error aberration was 11.54 for the proposed method and 34.21 for the baseline, so the proposed method outperformed the baseline. The reference deviation of our method was also better than that of the baseline.
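
The error aberration described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say whether the difference is signed or absolute, nor exactly how the "average total" is aggregated, so the sketch assumes an absolute per-phrase difference averaged over phrases. The function names are hypothetical and the sample counts are made up for illustration; this is not the authors' implementation.

```python
from typing import Sequence


def error_aberration(actual: int, detected: int) -> int:
    """Per-phrase difference between actual and detected syllable counts.

    Assumption: the absolute difference is used; the abstract does not
    state whether the difference is signed.
    """
    return abs(actual - detected)


def mean_error_aberration(
    actual_counts: Sequence[int], detected_counts: Sequence[int]
) -> float:
    """Average the per-phrase error aberration over all phrases.

    Assumption: a plain arithmetic mean over phrases; the paper's exact
    aggregation ("average total error aberration") may differ.
    """
    if len(actual_counts) != len(detected_counts):
        raise ValueError("both sequences must cover the same phrases")
    if not actual_counts:
        raise ValueError("at least one phrase is required")
    total = sum(
        error_aberration(a, d) for a, d in zip(actual_counts, detected_counts)
    )
    return total / len(actual_counts)


# Illustrative (made-up) syllable counts for three phrases.
actual = [5, 7, 4]
detected = [6, 7, 2]
score = mean_error_aberration(actual, detected)
```

Under these assumptions a lower value means the detector's syllable count is closer to the transcription, which is the sense in which 11.54 is better than 34.21.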
Pages: 5