Recovering Punctuation Marks for Automatic Speech Recognition

Cited by: 0
Authors
Batista, Fernando [1]
Caseiro, Diamantino [1]
Mamede, Nuno [1]
Trancoso, Isabel [1]
Affiliations
[1] INESC ID Lisboa, Lab Sistemas Lingua Falada, P-1000029 Lisbon, Portugal
Keywords
rich transcription; punctuation recovery; sentence boundary detection; maximum entropy
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents results on recovering punctuation in speech transcriptions for a Portuguese broadcast news corpus. The approach is based on maximum entropy models and uses word, part-of-speech, time and speaker information. The contribution of each type of feature is analyzed individually. Separate results are given for each focus condition, making it possible to analyze the differences in performance between planned and spontaneous speech.
Pages: 1977-1980
Page count: 4
Related papers
  • [1] Recovering capitalization and punctuation marks for automatic speech recognition: Case study for Portuguese broadcast news
    L2F, Spoken Language Systems Laboratory, INESC ID Lisboa, R. Alves Redol, 9, 1000-029 Lisboa, Portugal
    Unknown
    Unknown
    [J]. Speech Commun, 2008, (10): 847 - 862
  • [2] Automatic Punctuation Generation For Speech
    Shen, Wenzhu
    Yu, Roger Peng
    Seide, Frank
    Wu, Ji
    [J]. 2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009), 2009: 586 - +
  • [3] Fast and Accurate Capitalization and Punctuation for Automatic Speech Recognition Using Transformer and Chunk Merging
    Binh Nguyen
    Vu Bao Hung Nguyen
    Hien Nguyen
    Pham Ngoc Phuong
    The-Loc Nguyen
    Quoc Truong Do
    Luong Chi Mai
    [J]. 2019 22ND CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA), 2019: 29 - 33
  • [4] Punctuation Marks
    Eich, G.
    [J]. KENYON REVIEW, 2010, 32 (02): 26 - 26
  • [5] PUNCTUATION MARKS
    ADORNO, TW
    [J]. ANTIOCH REVIEW, 1990, 48 (03): 300 - 305
  • [6] Recovering Capitalization for Automatic Speech Recognition of Vietnamese using Transformer and Chunk Merging
    Hien Nguyen Thi Thu
    Binh Nguyen Thai
    Hung Nguyen Vu Bao
    Truong Do Quoc
    Mai Luong Chi
    Huyen Nguyen Thi Minh
    [J]. PROCEEDINGS OF 2019 11TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2019), 2019: 430 - 434
  • [7] Bilingual Experiments on Automatic Recovery of Capitalization and Punctuation of Automatic Speech Transcripts
    Batista, Fernando
    Moniz, Helena
    Trancoso, Isabel
    Mamede, Nuno
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02): 474 - 485
  • [8] PUNCTUATION PREDICTION FOR STREAMING ON-DEVICE SPEECH RECOGNITION
    Zhou, Zhikai
    Tan, Tian
    Qian, Yanmin
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 7277 - 7281
  • [9] PUNCTUATION MARKS AS LINGUISTICS SYMBOLS
    Kleppa, LouAnn
    [J]. CADERNOS DE ESTUDOS LINGUISTICOS, 2023, 65
  • [10] THE HISTORY OF PUNCTUATION MARKS IN SPANISH
    Borzenkova, A. A.
    Koteniatkina, I. B.
    [J]. VESTNIK ROSSIISKOGO UNIVERSITETA DRUZHBY NARODOV-SERIYA LINGVISTIKA-RUSSIAN JOURNAL OF LINGUISTICS, 2015, (02): 148 - 155