Recognizing disfluencies in conversational speech

Cited by: 21

Authors:

Lease, Matthew [1]

Johnson, Mark

Charniak, Eugene

Affiliations:

[1] Brown Univ, BLLIP, Dept Comp Sci, Providence, RI 02912 USA

[2] Brown Univ, BLLIP, Dept Cognit & Linguist Sci, Providence, RI 02912 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2006 / Vol. 14 / Issue 05

Funding:

National Science Foundation (USA);

Keywords:

disfluency modeling; natural language processing; rich transcription; speech processing;

DOI:

10.1109/TASL.2006.878269

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

We present a system for modeling disfluency in conversational speech: repairs, fillers, and self-interruption points (IPs). For each sentence, candidate repair analyses are generated by a stochastic tree adjoining grammar (TAG) noisy-channel model. A probabilistic syntactic language model scores the fluency of each analysis, and a maximum-entropy model selects the most likely analysis given the language model score and other features. Fillers are detected independently via a small set of deterministic rules, and IPs are detected by combining the output of the repair and filler detection modules. In the recent Rich Transcription Fall 2004 (RT-04F) blind evaluation, systems competed to detect these three forms of disfluency under two input conditions: a best-case scenario of manually transcribed words and a fully automatic case of automatic speech recognition (ASR) output. For all three tasks and on both types of input, our system was the top performer in the evaluation.


Pages: 1566 - 1573

Page count: 8

