Application of the modified group delay function to speaker identification and discrimination

Cited by: 0

Authors:

Hegde, RM [1]

Murthy, HA [1]

Rao, GVR [1]

Affiliation:

[1] Indian Inst Technol, Dept Comp Sci & Engn, Madras, Tamil Nadu, India

Source:

2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING | 2004

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

In this paper, we explore new methods by which speakers can be identified and discriminated, using features derived from the Fourier transform phase. The modified group delay feature (MODGDF), which is a parameterized form of the modified group delay function, is used as the front-end feature in this study. A Gaussian mixture model (GMM) based speaker identification system is built with the MODGDF as the front-end feature. The system is tested on both clean (TIMIT) and noisy telephone (NTIMIT) speech. The results are compared with those obtained using traditional Mel frequency cepstral coefficients (MFCC), which are derived from the Fourier transform magnitude. When MFCC and MODGDF were combined, performance improved by about 4%, indicating that phase and magnitude contain complementary information. In an earlier paper [1], it was shown that the MODGDF possesses phoneme-specific characteristics. In this paper, we show that the MODGDF also has speaker-specific properties. We also attempt to understand the speaker-discriminating characteristics of the MODGDF using the nonlinear mapping technique based on Sammon mapping [10], and find that the MODGDF empirically demonstrates a certain level of linear separability among speakers.
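As a rough illustration of what a phase-derived feature looks like, the sketch below computes a group delay function directly from the signal, without phase unwrapping, and then a modified variant. It is not taken from the paper: the abstract gives no formula, so the code follows the form commonly given for the modified group delay function, and the names `group_delay`, `modified_group_delay`, `alpha`, `gamma`, and `smooth_width`, together with their default values, are assumptions chosen for illustration. A circular moving average stands in for the cepstral smoothing normally used, and the parameterization step that would turn the function into the MODGDF is omitted.

```python
import cmath
import math


def dft(x, n_fft):
    """Naive O(N^2) DFT of x, implicitly zero-padded to n_fft points."""
    return [
        sum(v * cmath.exp(-2j * math.pi * k * n / n_fft) for n, v in enumerate(x))
        for k in range(n_fft)
    ]


def moving_average(values, width):
    """Circular moving average (assumed stand-in for cepstral smoothing)."""
    n = len(values)
    half = width // 2
    return [
        sum(values[(k + j) % n] for j in range(-half, half + 1)) / (2 * half + 1)
        for k in range(n)
    ]


def group_delay(x, n_fft):
    """Group delay tau = (X_R*Y_R + X_I*Y_I) / |X|^2, where Y is the DFT of n*x[n]."""
    X = dft(x, n_fft)
    Y = dft([n * v for n, v in enumerate(x)], n_fft)
    eps = 1e-12  # guards against division by zero at spectral nulls
    return [
        (a.real * b.real + a.imag * b.imag) / (abs(a) ** 2 + eps)
        for a, b in zip(X, Y)
    ]


def modified_group_delay(x, n_fft, alpha=0.4, gamma=0.9, smooth_width=5):
    """Modified form: smoothed spectrum in the denominator, then compression.

    alpha, gamma and smooth_width are illustrative values, not the paper's.
    """
    X = dft(x, n_fft)
    Y = dft([n * v for n, v in enumerate(x)], n_fft)
    S = moving_average([abs(a) for a in X], smooth_width)
    out = []
    for a, b, s in zip(X, Y, S):
        tau = (a.real * b.real + a.imag * b.imag) / (s ** (2 * gamma) + 1e-12)
        # Keep the sign of tau but compress its magnitude with exponent alpha.
        out.append(math.copysign(abs(tau) ** alpha, tau))
    return out
```

For a unit impulse delayed by three samples, `group_delay` returns a value close to 3 at every frequency bin, which is a convenient sanity check that the phase slope is being recovered.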

Citation

Pages: 517 / +

Page count: 2

50 items in total

[41] Compensating Function of Formant Instantaneous Characteristics in Speaker Identification
Hou, Limin
Xie, Juanmin
FIFTH INTERNATIONAL CONFERENCE ON INFORMATION ASSURANCE AND SECURITY, VOL 1, PROCEEDINGS, 2009 : 744 - 747
[42] Group Sparsity for Speaker Identity Discrimination in Factorisation-based Speech Recognition
Hurmalainen, Antti
Saeidi, Rahim
Virtanen, Tuomas
13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012 : 2135 - 2138
[43] MODIFIED FOURIER'S LAW WITH TIME-DELAY AND KERNEL FUNCTION: APPLICATION IN THERMOELASTICITY
El-Karamany, Ahmed S.
Ezzat, Magdy A.
JOURNAL OF THERMAL STRESSES, 2015, 38 (07) : 811 - 834
[44] Evaluating the effects of modified speech on perceptual speaker identification performance
O'Brien, Benjamin
Meunier, Christine
Ghio, Alain
INTERSPEECH 2022, 2022 : 3073 - 3077
[45] A modified HME architecture for text-dependent speaker identification
Chen, K
Xie, DH
Chi, HS
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1996, 7 (05) : 1309 - 1313
[46] Speaker Identification for Disguised Voices Based on Modified SVM Classifier
Al Hindawi, Noor Ahmad
Shahin, Ismail
Nassif, Ali Bou
2021 18TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2021 : 687 - 691
[47] Group delay-based minimum variance distortion-less response cepstral features for speaker identification in whispered speech
Sardar, Vijay M.
Jadhav, Manisha L.
Deshmukh, Saurabh H.
Jadhav, Makarand M.
INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2023, 73 (02) : 104 - 112
[48] Modified HME architecture for text-dependent speaker identification
Peking Univ, Beijing, China
IEEE Trans Neural Networks, 5 (1309-1313):
[49] Modified Group Delay Features for Emotion Recognition
Uthiraa, S.
Pusuluri, Aditya
Patil, Hemant A.
PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2023, 2023, 14301 : 321 - 330
[50] Speaker Identification Based on Integrated Face Direction in a Group Conversation
Ienaga, Naoto
Ozasa, Yuko
Saito, Hideo
2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS (WACVW), 2017 : 53 - 57
