Capture inter-speaker information with a neural network for speaker identification

Authors
Wang, L [1]
Chen, K [1]
Chi, HH [1]
Affiliations
[1] Peking Univ, Natl Lab Machine Percept, Beijing 100871, Peoples R China
DOI
10.1109/IJCNN.2000.861465
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many speaker identification systems are built with model-based approaches, in which a statistical model characterizes each speaker's voice and no inter-speaker information is used in parameter estimation. It is well known that inter-speaker information is very helpful in discriminating between different speakers. In this paper, we propose a novel method that uses inter-speaker information to improve the performance of a model-based speaker identification system. A neural network is employed to capture inter-speaker information from the output space of the statistical models. To make full use of inter-speaker information, a rival penalized encoding rule is proposed for designing the supervised learning pairs used to train the neural network. Comparative results on the KING speech corpus show that our method yields a considerable improvement for a model-based speaker identification system.
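The abstract does not spell out the encoding rule or the network architecture, so the following is only an illustrative sketch of the general idea: per-speaker model scores form the input vector, and a small network is trained on targets that reward the true speaker and penalize the strongest rival. Everything specific here is an assumption made for illustration, not the authors' method: the target values (+1, a negative penalty, 0), the single tanh layer, and the hypothetical names `rival_penalized_targets`, `train_single_layer`, and `identify`.

```python
import numpy as np


def rival_penalized_targets(scores, true_idx, penalty=-1.0):
    """Hypothetical encoding (an assumption, not taken from the paper):
    +1 for the true speaker, `penalty` for the highest-scoring wrong
    model (the "rival"), and 0 for every other speaker."""
    scores = np.asarray(scores, dtype=float)
    targets = np.zeros_like(scores)
    masked = scores.copy()
    masked[true_idx] = -np.inf          # exclude the true speaker from the rival search
    rival_idx = int(np.argmax(masked))  # strongest competing model
    targets[true_idx] = 1.0
    targets[rival_idx] = penalty
    return targets


def train_single_layer(X, T, lr=0.1, epochs=500, seed=0):
    """Train one tanh layer by gradient descent on squared error.
    X: (n_utterances, n_speakers) normalized model scores.
    T: (n_utterances, n_speakers) encoded targets."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(X.shape[1], T.shape[1]))
    b = np.zeros(T.shape[1])
    for _ in range(epochs):
        Y = np.tanh(X @ W + b)
        grad = (Y - T) * (1.0 - Y ** 2) / len(X)  # d(loss)/d(pre-activation)
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def identify(scores, W, b):
    """Return the index of the speaker with the largest network output."""
    return int(np.argmax(np.tanh(np.asarray(scores, dtype=float) @ W + b)))
```

In this sketch the scores are assumed to be normalized per utterance (for example, zero mean and unit variance across speakers) before training, since raw log-likelihoods would saturate the tanh units.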
Pages: 247-252
Page count: 6