Recognizing disfluencies in conversational speech

Cited by: 21
Authors
Lease, Matthew [1]
Johnson, Mark
Charniak, Eugene
Affiliations
[1] Brown Univ, BLLIP, Dept Comp Sci, Providence, RI 02912 USA
[2] Brown Univ, BLLIP, Dept Cognit & Linguist Sci, Providence, RI 02912 USA
Funding
U.S. National Science Foundation
Keywords
disfluency modeling; natural language processing; rich transcription; speech processing
DOI
10.1109/TASL.2006.878269
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We present a system for modeling disfluency in conversational speech: repairs, fillers, and self-interruption points (IPs). For each sentence, candidate repair analyses are generated by a stochastic tree adjoining grammar (TAG) noisy-channel model. A probabilistic syntactic language model scores the fluency of each analysis, and a maximum-entropy model selects the most likely analysis given the language model score and other features. Fillers are detected independently via a small set of deterministic rules, and IPs are detected by combining the output of the repair and filler detection modules. In the recent Rich Transcription Fall 2004 (RT-04F) blind evaluation, systems competed to detect these three forms of disfluency under two input conditions: a best-case scenario of manually transcribed words and a fully automatic case of automatic speech recognition (ASR) output. For all three tasks and on both types of input, our system was the top performer in the evaluation.
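The abstract says that fillers are found by deterministic rules and that IPs are derived from the repair and filler outputs. The Python sketch below illustrates one way such a combination could look. It is not the authors' implementation: the filler lexicon `FILLER_PHRASES`, the function names, and the convention that an IP falls at the end of a reparandum and at the start of a filler are all assumptions made for illustration, and the repair spans are supplied by hand rather than produced by a TAG noisy-channel model.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # half-open token span: [start, end)

# Hypothetical filler lexicon, for illustration only (not the paper's rules).
# Multi-word phrases come first so they are matched before single words.
FILLER_PHRASES: List[Tuple[str, ...]] = [
    ("you", "know"),
    ("i", "mean"),
    ("uh",),
    ("um",),
]


def detect_fillers(tokens: Sequence[str]) -> List[Span]:
    """Return spans of tokens that match the filler lexicon."""
    lowered = [t.lower() for t in tokens]
    spans: List[Span] = []
    i = 0
    while i < len(lowered):
        for phrase in FILLER_PHRASES:
            if tuple(lowered[i:i + len(phrase)]) == phrase:
                spans.append((i, i + len(phrase)))
                i += len(phrase)
                break
        else:
            # No phrase matched at position i; move to the next token.
            i += 1
    return spans


def detect_ips(repair_spans: Sequence[Span],
               filler_spans: Sequence[Span]) -> List[int]:
    """Combine repair and filler output into IP positions.

    An IP index k denotes the word boundary just before token k.
    Assumed convention: one IP after each reparandum, one before each filler.
    """
    ips = {end for _, end in repair_spans}
    ips |= {start for start, _ in filler_spans}
    return sorted(ips)


tokens = "I want a flight to Boston uh I mean to Denver".split()
fillers = detect_fillers(tokens)
# Hand-supplied reparandum span covering "to Boston" (tokens 4 and 5).
repairs = [(4, 6)]
ips = detect_ips(repairs, fillers)
print(fillers, ips)
```

Representing IPs as boundary indices lets the repair and filler modules stay independent: each emits spans, and a boundary shared by both (here, the point after "Boston" and before "uh") is reported only once.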
Pages: 1566-1573
Page count: 8