ESTIMATION OF THE INVARIANT AND VARIANT CHARACTERISTICS IN SPEECH ARTICULATION AND ITS APPLICATION TO SPEAKER IDENTIFICATION

Cited by: 0

Authors:

Prasad, Abhay [1]

Periyasamy, Vijitha [2]

Ghosh, Prasanta Kumar [2]

Affiliations:

[1] Manipal Inst Technol, Manipal 576104, Karnataka, India

[2] Indian Inst Sci IISc, Dept Elect Engn, Bangalore 560012, Karnataka, India

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015

Keywords:

speech articulation; invariant gestures; speaker identification; FEATURES; PURSUIT;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

Speech articulation varies across speakers for producing a speech sound due to the differences in their vocal tract morphologies, though the speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation from multiple speakers into the variant and invariant aspects when they speak the same sentence. The variant component is found to be a better representation for discriminating speakers compared to the speech articulation which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields a better frame-level speaker identification accuracy compared to the speech articulation as well as acoustic features by 29.9% and 9.4% (absolute) respectively.


Pages: 4265-4269

Page count: 5

50 items in total

[1] Emotion Invariant Speaker Embeddings for Speaker Identification with Emotional Speech
Sarma, Biswajit Dev
Das, Rohan Kumar
2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020: 610 - 615
[2] Application of formant instantaneous characteristics to speech recognition and speaker identification
侯丽敏
胡晓宁
谢娟敏
Advances in Manufacturing, 2011, (02) : 123 - 127
[3] VARIANT AND INVARIANT CHARACTERISTICS OF SPEECH MOVEMENTS
GRACCO, VL
ABBS, JH
EXPERIMENTAL BRAIN RESEARCH, 1986, 65 (01) : 156 - 166
[4] A forward masking auditory model and its application in speaker identification and speech recognition
Liu, ZM
Wu, XH
Zhen, B
Chi, HS
CHINESE JOURNAL OF ELECTRONICS, 2001, 10 (02): 196 - 199
[5] Iterative PSF Estimation and Its Application to Shift Invariant and Variant Blur Reduction
Jung, Seung-Won
Choi, Byeong-Doo
Ko, Sung-Jea
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009
[6] Iterative PSF Estimation and Its Application to Shift Invariant and Variant Blur Reduction
Seung-Won Jung
Byeong-Doo Choi
Sung-Jea Ko
EURASIP Journal on Advances in Signal Processing, 2009
[7] Tree-Based Estimation of Speaker Characteristics for Speech Recognition
Blomberg, Mats
Elenius, Daniel
INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009: 584 - 587
[8] Estimation of Place of Articulation of Fricatives from Spectral Characteristics for Speech Training
Nataraj, K. S.
Pandey, Prem C.
Dasgupta, Hirak
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 339 - 343
[9] Speech representation based on tensor factor analysis and its application to speaker recognition and language identification
Saito, Daisuke
Suzuki, So
Minematsu, Nobuaki
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019: 402 - 406
[10] Robust speech features based on wavelet transform with application to speaker identification
Hsieh, CT
Lai, E
Wang, YC
IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 2002, 149 (02): 108 - 114
