ESTIMATION OF THE INVARIANT AND VARIANT CHARACTERISTICS IN SPEECH ARTICULATION AND ITS APPLICATION TO SPEAKER IDENTIFICATION

Cited by: 0
Authors
Prasad, Abhay [1 ]
Periyasamy, Vijitha [2 ]
Ghosh, Prasanta Kumar [2 ]
Affiliations
[1] Manipal Inst Technol, Manipal 576104, Karnataka, India
[2] Indian Inst Sci IISc, Dept Elect Engn, Bangalore 560012, Karnataka, India
Keywords
speech articulation; invariant gestures; speaker identification; FEATURES; PURSUIT;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Speech articulation for a given speech sound varies across speakers because of differences in their vocal tract morphologies, even though the speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation that decomposes the speech articulation of multiple speakers uttering the same sentence into variant and invariant components. The variant component is found to discriminate speakers better than the full speech articulation, which also includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields higher frame-level speaker identification accuracy than the full speech articulation and than acoustic features, by 29.9% and 9.4% (absolute), respectively.
Pages: 4265-4269
Page count: 5
Related papers
50 in total
  • [1] Emotion Invariant Speaker Embeddings for Speaker Identification with Emotional Speech
    Sarma, Biswajit Dev
    Das, Rohan Kumar
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 610 - 615
  • [2] Application of formant instantaneous characteristics to speech recognition and speaker identification
    侯丽敏
    胡晓宁
    谢娟敏
    Advances in Manufacturing, 2011, (02) : 123 - 127
  • [3] VARIANT AND INVARIANT CHARACTERISTICS OF SPEECH MOVEMENTS
    GRACCO, VL
    ABBS, JH
    EXPERIMENTAL BRAIN RESEARCH, 1986, 65 (01) : 156 - 166
  • [4] A forward masking auditory model and its application in speaker identification and speech recognition
    Liu, ZM
    Wu, XH
    Zhen, B
    Chi, HS
    CHINESE JOURNAL OF ELECTRONICS, 2001, 10 (02): : 196 - 199
  • [5] Iterative PSF Estimation and Its Application to Shift Invariant and Variant Blur Reduction
    Jung, Seung-Won
    Choi, Byeong-Doo
    Ko, Sung-Jea
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2009,
  • [6] Iterative PSF Estimation and Its Application to Shift Invariant and Variant Blur Reduction
    Seung-Won Jung
    Byeong-Doo Choi
    Sung-Jea Ko
    EURASIP Journal on Advances in Signal Processing, 2009
  • [7] Tree-Based Estimation of Speaker Characteristics for Speech Recognition
    Blomberg, Mats
    Elenius, Daniel
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 584 - 587
  • [8] Estimation of Place of Articulation of Fricatives from Spectral Characteristics for Speech Training
    Nataraj, K. S.
    Pandey, Prem C.
    Dasgupta, Hirak
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 339 - 343
  • [9] Speech representation based on tensor factor analysis and its application to speaker recognition and language identification
    Saito, Daisuke
    Suzuki, So
    Minematsu, Nobuaki
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 402 - 406
  • [10] Robust speech features based on wavelet transform with application to speaker identification
    Hsieh, CT
    Lai, E
    Wang, YC
    IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 2002, 149 (02): : 108 - 114