Speaker change detection in casual conversations using excitation source features

Authors
Dhananjaya, N. [1]
Yegnanarayana, B. [2]
Affiliations
[1] Indian Inst Technol, Madras 600036, Tamil Nadu, India
[2] Int Inst Informat Technol, Hyderabad, Andhra Pradesh, India
Keywords
speaker change detection; multispeaker conversation; autoassociative neural network (AANN) models; excitation source features; linear prediction (LP) residual
DOI
10.1016/j.specom.2007.08.003
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper we propose a method for speaker change detection using features of the excitation source of the speech production mechanism. The method uses neural network models to capture speaker-specific information from a signal that predominantly represents the excitation source. The focus is on speaker change detection in casual telephone conversations, in which short (<5 s) speaker turns are common. When only a limited amount of speech data is available, excitation source features are a better choice for modeling a speaker than vocal tract system features. The linear prediction residual is used as an estimate of the excitation source signal. Autoassociative neural network models are proposed to capture the higher-order relations among the samples of the residual signal. Speaker models are generated for every one second of voiced speech from the first few seconds of the conversation. These models are used to detect the speaker change points. The performance of the proposed method is evaluated on a database containing several two-speaker conversations. © 2007 Elsevier B.V. All rights reserved.
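The abstract names the linear prediction (LP) residual as the estimate of the excitation source signal. As a rough illustration of that one step only, the sketch below computes an LP residual for a single frame using the standard autocorrelation method. It is not the authors' implementation: the function names `lp_coefficients` and `lp_residual`, the default prediction order of 12, and the small regularisation term are all assumptions made for this example, and the autoassociative neural network modeling stage is not shown.

```python
import numpy as np

def lp_coefficients(frame, order=12):
    """Estimate LP coefficients a[1..order] by the autocorrelation method.

    The default order is an assumption for illustration; the abstract
    does not state the order used in the paper.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with R[i, j] = r[|i - j|].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal term (an assumption) keeps R invertible for silent frames.
    R += (1e-9 * r[0] + 1e-12) * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, order=12):
    """Return e[n] = s[n] - sum_k a[k] * s[n - k] for one frame."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    # Inverse filter A(z) = 1 - sum_k a[k] z^{-k}, applied with zero history.
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(frame, inverse_filter)[: len(frame)]
```

Subtracting the linear prediction removes most of the short-term correlation attributed to the vocal tract, so what remains is dominated by the excitation. In practice the residual would be computed frame by frame over voiced speech before any speaker modeling.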
Pages: 153-161
Number of pages: 9