Using spatial cues for meeting speech segmentation

Times cited: 0
Authors
Cheng, E [1]
Lukasiak, J [1]
Burnett, IS [1]
Stirling, D [1]
Affiliations
[1] Univ Wollongong, Sch Elect Comp & Telecommun Engn, Wollongong, NSW 2500, Australia
DOI
10.1109/ICME.2005.1521432
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work investigates the validity and accuracy of using spatial cues with Time-Delay Estimation (TDE) as a method of segmenting multichannel recorded speech by speaker location. In environments such as meetings where speakers do not significantly alter position, segmentation by speaker location essentially leads to segmentation by speaker 'turn'. The proposed system calculates location information using TDEs and spatial cues extracted from multichannel meeting audio recordings. This location information is then input into a simple segmentation algorithm. Experiments have been performed on both theoretical and real meeting recordings with non-overlapping speakers, and theoretical recordings with overlapping speakers. Segmentation results reveal the most robust cue to be a combination of spatial information and TDEs. This cue combination leads to greater segmentation accuracy for classifying individual speakers and detecting overlapping sections than using spatial cues or time-delay information alone.
Pages: 350-353
Page count: 4
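As an illustration of the idea summarised in the abstract (estimate a time delay per frame, then treat a change in delay as a change of speaker location), the following is a minimal sketch only. It is not the authors' system: it uses plain two-channel cross-correlation for the delay estimate and omits the spatial cues entirely. The function names, the frame length, the maximum lag and the tolerance are all assumptions chosen for the example.

```python
import numpy as np


def estimate_delay(ch_a, ch_b, max_lag):
    """Return the lag in samples by which ch_b trails ch_a (negative if it leads).

    Hypothetical helper: a plain cross-correlation peak search, not the
    estimator used in the paper.
    """
    lags = np.arange(-max_lag, max_lag + 1)
    core = ch_a[max_lag:len(ch_a) - max_lag]
    # Correlate ch_a[n] with ch_b[n + lag] for every candidate lag.
    scores = [
        np.dot(core, ch_b[max_lag + lag:len(ch_b) - max_lag + lag])
        for lag in lags
    ]
    return int(lags[int(np.argmax(scores))])


def segment_by_delay(left, right, frame_len=1024, max_lag=16, tol=2):
    """Group consecutive frames whose estimated delay stays within tol samples.

    Returns a list of (start_frame, end_frame_exclusive, delay) tuples.
    All parameter defaults are assumptions for illustration.
    """
    delays = []
    for start in range(0, len(left) - frame_len + 1, frame_len):
        stop = start + frame_len
        delays.append(estimate_delay(left[start:stop], right[start:stop], max_lag))

    segments = []
    seg_start = 0
    for i in range(1, len(delays) + 1):
        # Close the current segment at the end of the data or when the
        # delay moves away from the delay that opened the segment.
        if i == len(delays) or abs(delays[i] - delays[seg_start]) > tol:
            segments.append((seg_start, i, delays[seg_start]))
            seg_start = i
    return segments
```

Each returned tuple marks a run of frames attributed to one location, which in a meeting with stationary speakers would correspond to one speaker turn. Overlapping speech is not handled by this sketch.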