ESTIMATION OF THE INVARIANT AND VARIANT CHARACTERISTICS IN SPEECH ARTICULATION AND ITS APPLICATION TO SPEAKER IDENTIFICATION

Cited by: 0

Authors:

Prasad, Abhay [1]

Periyasamy, Vijitha [2]

Ghosh, Prasanta Kumar [2]

Affiliations:

[1] Manipal Inst Technol, Manipal 576104, Karnataka, India

[2] Indian Inst Sci IISc, Dept Elect Engn, Bangalore 560012, Karnataka, India

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015

Keywords:

speech articulation; invariant gestures; speaker identification; FEATURES; PURSUIT;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Speech articulation varies across speakers for producing a speech sound due to the differences in their vocal tract morphologies, though the speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation from multiple speakers into the variant and invariant aspects when they speak the same sentence. The variant component is found to be a better representation for discriminating speakers compared to the speech articulation which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields a better frame-level speaker identification accuracy compared to the speech articulation as well as acoustic features by 29.9% and 9.4% (absolute) respectively.

Pages: 4265 - 4269

Number of pages: 5

50 items in total

[31] Parameter optimization for Gaussian mixture model and its application in speaker identification
1600, ICIC Express Letters Office (07):
[32] ROBUST MULTI CHANNEL TDOA ESTIMATION FOR SPEAKER LOCALIZATION USING THE IMPULSIVE CHARACTERISTICS OF SPEECH SPECTRUM
He, Hongsen
Chen, Jingdong
Benesty, Jacob
Zhou, Yingyue
Yang, Tao
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 6130 - 6134
[33] SEMIAUTOMATIC SPEECH SOUNDS AURAL IDENTIFICATION PROCEDURE WITH ITS APPLICATION TO SPEECH ANALYSIS
CHRISTOV, PD
ACUSTICA, 1973, 29 (06): : 347 - 349
[34] A novel affine invariant feature set and its application in motion estimation
Chen, LY
Lu, ZK
Teoh, EK
Xue, Z
2000 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL III, PROCEEDINGS, 2000, : 612 - 615
[35] Speech Representation Using Linear Chirplet Transform and Its Application in Speaker-Related Recognition
Do, Hao D.
Chau, Duc T.
Tran, Son T.
COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501 : 719 - 729
[36] The role of speaker gender identification in relative fundamental frequency height estimation from multispeaker, brief speech segments
Lee, Chao-Yang
Dutton, Lauren
Ram, Gayatri
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2010, 128 (01): : 384 - 388
[37] Laplace entropy and its application to time delay estimation for speech signals
Huang, Yiteng
Benesty, Jacob
Chen, Jingdong
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS, 2007, : 113 - +
[38] Feature selection using genetics-based algorithm and its application to speaker identification
METU, Ankara, Turkey
ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, (329-332):
[39] HIERARCHICAL MIXTURE CLUSTERING AND ITS APPLICATION TO GMM BASED TEXT INDEPENDENT SPEAKER IDENTIFICATION
Saeidi, R.
Mohammadi, H. R. Sadegh
Ganchev, T.
Rodman, R. D.
2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008, : 770 - +
[40] Feature selection using genetics-based algorithm and its application to speaker identification
Demirekler, M
Haydar, A
ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 329 - 332
