ESTIMATION OF THE INVARIANT AND VARIANT CHARACTERISTICS IN SPEECH ARTICULATION AND ITS APPLICATION TO SPEAKER IDENTIFICATION

Cited by: 0
Authors
Prasad, Abhay [1]
Periyasamy, Vijitha [2]
Ghosh, Prasanta Kumar [2]
Affiliations
[1] Manipal Inst Technol, Manipal 576104, Karnataka, India
[2] Indian Inst Sci IISc, Dept Elect Engn, Bangalore 560012, Karnataka, India
Source
2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015
Keywords
speech articulation; invariant gestures; speaker identification; FEATURES; PURSUIT;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Speech articulation varies across speakers for producing a speech sound due to the differences in their vocal tract morphologies, though the speech motor actions are executed in terms of relatively invariant gestures [1]. While the invariant articulatory gestures are driven by the linguistic content of the spoken utterance, the component of speech articulation that varies across speakers reflects speaker-specific and other paralinguistic information. In this work, we present a formulation to decompose the speech articulation from multiple speakers into the variant and invariant aspects when they speak the same sentence. The variant component is found to be a better representation for discriminating speakers compared to the speech articulation which includes the invariant part. Experiments with real-time magnetic resonance imaging (rtMRI) videos of speech production from multiple speakers reveal that the variant component of speech articulation yields a better frame-level speaker identification accuracy compared to the speech articulation as well as acoustic features by 29.9% and 9.4% (absolute) respectively.
Pages: 4265 - 4269
Page count: 5
Related papers
50 in total
  • [21] Robust acoustic domain identification with its application to speaker diarization
    Kumar A.K.
    Waldekar S.
    Sahidullah M.
    Saha G.
    International Journal of Speech Technology, 2022, 25 (04) : 933 - 945
  • [22] Temporal correlation based speech feature processing and its application to speaker recognition
    Xiaofei Xie
    ChengGong Yu
    2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS, 2006, : 1074 - +
  • [23] THE HIDDEN MARKOV MODEL OF CO-ARTICULATION AND ITS APPLICATION TO THE CONTINUOUS SPEECH RECOGNITION
    Lee Tranzai Zheng Fang Wu Wenhu Chen Daowen(Speech Lab.
    Journal of Electronics(China), 2000, (03) : 242 - 247
  • [24] Automatic estimation of the first three subglottal resonances from adults' speech signals with application to speaker height estimation
    Arsikere, Harish
    Leung, Gary K. F.
    Lulich, Steven M.
    Alwan, Abeer
    SPEECH COMMUNICATION, 2013, 55 (01) : 51 - 70
  • [25] Codebook design using genetic algorithm and its application to speaker identification
    Zhang, L
    Zheng, B
    Yang, Z
    ELECTRONICS LETTERS, 2005, 41 (10) : 619 - 620
  • [26] Speaker Identification and Its Application to Social Network Construction for Chinese Novels
    Jia, Yuxiang
    Dou, Huayi
    Cao, Shuai
    Zan, Hongying
    2020 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2020), 2020, : 13 - 18
  • [27] Automatic speaker verification from affective speech using Gaussian mixture model based estimation of neutral speech characteristics
    Avila, Anderson R.
    O'Shaughnessy, Douglas
    Falk, Tiago H.
    SPEECH COMMUNICATION, 2021, 132 : 21 - 31
  • [28] Speaker identification and its application in automobile industry for automatic seat adjustment
    Sumit Srivastava
    Mahesh Chandra
    G. Sahoo
    Microsystem Technologies, 2019, 25 : 2339 - 2347
  • [29] Speaker identification and its application in automobile industry for automatic seat adjustment
    Srivastava, Sumit
    Chandra, Mahesh
    Sahoo, G.
    MICROSYSTEM TECHNOLOGIES-MICRO-AND NANOSYSTEMS-INFORMATION STORAGE AND PROCESSING SYSTEMS, 2019, 25 (06) : 2339 - 2347
  • [30] Variance-based filtering model and its application to speaker identification
    Qi, HW
    Guan, Y
    Liu, WJ
    Wang, J
    MACHINE LEARNING FOR SIGNAL PROCESSING XIV, 2004, : 285 - 294