Automatic Turn Segmentation in Spoken Conversations

Cited by: 0

Authors:

Ivanov, Alexei V. [1]

Riccardi, Giuseppe [1]

Affiliation:

[1] Univ Trent, Dept Informat Engn & Comp Sci, Trento, Italy

Source:

11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 | 2010

Keywords:

spoken turn boundary; spoken dialogs; modulation spectrum; Bayesian information criterion; Kullback-Leibler divergence; SPEECH;

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In this paper we study the problem of detecting spoken turn boundaries in human-human spoken conversations. Automating this task is essential to enable the analysis, recognition and understanding of speech transcriptions and dialog structures (e.g. turn taking and dialog act segmentation). The problem formulation differs from previous work on metadata extraction in that we work in the time domain to detect boundaries. This approach has the advantage of giving fine-grained measures of speech events and does not rely on automatic speech transcriptions. We have explored the applicability of different algorithms to this task and have found that a hidden Markov model combining the results of modulation spectrum analysis with the Kullback-Leibler divergence of adjacent signal portions produces the best results. The performance of the algorithms has been evaluated on the Switchboard conversational speech corpus.

Pages: 3130-3133

Page count: 4
