Speaker change detection in casual conversations using excitation source features

Cited by: 4
Authors
Dhananjaya, N. [1]
Yegnanarayana, B. [2]
Affiliations
[1] Indian Inst Technol, Madras 600036, Tamil Nadu, India
[2] Int Inst Informat Technol, Hyderabad, Andhra Pradesh, India
Keywords
speaker change detection; multispeaker conversation; autoassociative neural network (AANN) models; excitation source features; linear prediction (LP) residual;
DOI
10.1016/j.specom.2007.08.003
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, we propose a method for speaker change detection using features of the excitation source of the speech production mechanism. The method uses neural network models to capture speaker-specific information from a signal that predominantly represents the excitation source. The focus is on speaker change detection in casual telephone conversations, in which short (<5 s) speaker turns are common. When only a limited amount of speech data is available, excitation source features are a better choice for modeling a speaker than vocal tract system features. The linear prediction residual is used as an estimate of the excitation source signal. Autoassociative neural network models are proposed to capture the higher-order relations among the samples of the residual signal. Speaker models are generated for every one second of voiced speech from the first few seconds of the conversation. These models are then used to detect the speaker change points. The performance of the proposed method is evaluated on a database containing several two-speaker conversations. © 2007 Elsevier B.V. All rights reserved.
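The abstract's use of the linear prediction (LP) residual as an estimate of the excitation source can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the autocorrelation method solved by the Levinson-Durbin recursion, and the function names `lp_coefficients` and `lp_residual`, as well as the default order of 10, are illustrative assumptions, since this record does not state the analysis settings.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Return inverse-filter coefficients a[0..order] with a[0] = 1.

    Assumption: autocorrelation method, solved by Levinson-Durbin.
    The prediction error is e[n] = x[n] + sum_k a[k] * x[n - k].
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # Silent or degenerate frame: stop, leaving higher terms at zero
            break
        # Reflection coefficient for stage i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lp_residual(frame, order=10):
    """Inverse-filter the frame with its own LP coefficients.

    The output has the same length as the input and approximates the
    excitation signal under the LP source-filter assumption.
    """
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    return np.convolve(frame, a)[:len(frame)]
```

In a sketch like this, blocks of residual samples from voiced regions would be the input to whatever speaker model is trained afterwards; the abstract names autoassociative neural networks for that step, which are not shown here.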
Pages: 153-161
Number of pages: 9