Application of the modified group delay function to speaker identification and discrimination

Cited by: 0

Authors:

Hegde, RM [1]

Murthy, HA [1]

Rao, GVR [1]

Affiliations:

[1] Indian Inst Technol, Dept Comp Sci & Engn, Madras, Tamil Nadu, India

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING | 2004

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we explore new methods by which speakers can be identified and discriminated, using features derived from the Fourier transform phase. The Modified Group Delay Feature (MODGDF), which is a parameterized form of the modified group delay function, is used as a front-end feature in this study. A Gaussian mixture model (GMM) based speaker identification system is built with the MODGDF as the front-end feature. The system is tested on both clean (TIMIT) and noisy telephone (NTIMIT) speech. The results obtained are compared with those of traditional Mel frequency cepstral coefficients (MFCC), which are derived from the Fourier transform magnitude. When MFCC and MODGDF were combined, the performance improved by about 4%, indicating that phase and magnitude contain complementary information. In an earlier paper [1], it was shown that the MODGDF possesses phoneme-specific characteristics. In this paper, we show that the MODGDF also has speaker-specific properties. We also attempt to understand the speaker-discriminating characteristics of the MODGDF using a nonlinear mapping technique based on Sammon mapping [10], and find that the MODGDF empirically demonstrates a certain level of linear separability among speakers.


Pages: 517 / +

Page count: 2

50 items in total

[11] Group Delay Based Methods for Speaker Segregation and its Application in Multimedia Information Retrieval
Nathwani, Karan
Pandit, Pranav
Hegde, Rajesh M.
IEEE TRANSACTIONS ON MULTIMEDIA, 2013, 15 (06) : 1326 - 1339
[12] An improved MMSE estimator based modified group delay spectrum for Forensic Automatic Speaker Recognition
Salim Djeghiour
Mhania Guerti
International Journal of Speech Technology, 2021, 24 : 687 - 699
[13] Forensic application of speaker identification
Drăghicescu, Dragoş
UPB Scientific Bulletin, Series C: Electrical Engineering and Computer Science, 2015, 77 (3) : 107 - 122
[14] FORENSIC APPLICATION OF SPEAKER IDENTIFICATION
Draghicescu, Dragos
UNIVERSITY POLITEHNICA OF BUCHAREST SCIENTIFIC BULLETIN SERIES C-ELECTRICAL ENGINEERING AND COMPUTER SCIENCE, 2015, 77 (03) : 107 - 122
[15] Application of Multivariate Membership Function Discrimination Method for Lithology Identification
Zhao, Jun
Wang, Feifei
Lu, Yifan
SAINS MALAYSIANA, 2017, 46 (11) : 2223 - 2229
[16] LS Regularization of Group Delay Features for Speaker Recognition
Kua, Jia Min Karen
Epps, Julien
Ambikairajah, Eliathamby
Choi, Eric
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009 : 2851 - +
[17] Speaker identification using time-delay HMEs
Chen, K
Xie, DH
Chi, HS
INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 1996, 7 (01) : 29 - 43
[18] Application of Discriminant Function Analysis in Ischemic Stroke Group Level Discrimination
Omar, W. R. W.
Taib, M. N.
Jailani, R.
Mohamad, Z.
Jahidin, A. H.
Sharif, Z.
2014 IEEE 10TH INTERNATIONAL COLLOQUIUM ON SIGNAL PROCESSING & ITS APPLICATIONS (CSPA 2014), 2014 : 229 - 232
[19] Speaker identification based on modified polynomial classifier
Zhang, XY
Wu, JP
Zhang, QS
2003 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-5, PROCEEDINGS, 2003 : 3178 - 3182
[20] Speaker identification based on a modified Kohonen network
Vieira, K
Wilamowski, B
Kubichek, R
1997 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, 1997 : 2103 - 2106
