Automatic Turn Segmentation in Spoken Conversations

Cited by: 0
Authors
Ivanov, Alexei V. [1]
Riccardi, Giuseppe [1]
Affiliations
[1] Univ Trent, Dept Informat Engn & Comp Sci, Trento, Italy
Keywords
spoken turn boundary; spoken dialogs; modulation spectrum; Bayesian information criterion; Kullback-Leibler divergence; SPEECH;
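DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract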
This paper studies the problem of detecting spoken turn boundaries in human-human spoken conversations. Automating this task is essential for the analysis, recognition and understanding of speech transcriptions and dialog structures (e.g., turn taking and dialog act segmentation). The problem formulation differs from previous work on metadata extraction in that boundaries are detected in the time domain. This approach has the advantage of giving fine-grained measures of speech events and does not rely on automatic speech transcriptions. We have explored the applicability of different algorithms to this task and have found that a hidden Markov model combining the results of modulation spectrum analysis and the Kullback-Leibler divergence of adjacent signal portions produces the best results. The performance of the algorithms has been evaluated on the Switchboard conversational speech corpus.
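To make the idea of a "Kullback-Leibler divergence of adjacent signal portions" concrete, the sketch below scores each frame by the symmetric KL divergence between Gaussian models fitted to the windows immediately before and after it. This is only an illustration and not the authors' implementation: the diagonal-covariance Gaussian model, the window length `win`, the smoothing constant `eps`, and the function names `gaussian_kl` and `adjacent_window_divergence` are all assumptions introduced here, and the abstract does not say which features or distributions the paper uses.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians (assumed model)."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def adjacent_window_divergence(features, win, eps=1e-6):
    """Symmetric KL between the `win` frames before and after each frame.

    features: array of shape (n_frames, n_dims), e.g. per-frame acoustic features.
    Returns an array of length n_frames; entries where a full window does not
    fit on both sides are left at zero.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    scores = np.zeros(n)
    for t in range(win, n - win + 1):
        left, right = features[t - win:t], features[t:t + win]
        mu_l, mu_r = left.mean(axis=0), right.mean(axis=0)
        # eps keeps the variances strictly positive (hypothetical smoothing)
        var_l, var_r = left.var(axis=0) + eps, right.var(axis=0) + eps
        scores[t] = (gaussian_kl(mu_l, var_l, mu_r, var_r)
                     + gaussian_kl(mu_r, var_r, mu_l, var_l))
    return scores
```

Peaks in the returned score mark positions where the signal statistics change abruptly, which makes them candidate boundaries. According to the abstract, the paper combines a divergence cue of this kind with modulation spectrum analysis inside a hidden Markov model, a step this sketch does not attempt.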
Pages: 3130-3133
Number of pages: 4