Person Recognition using Humming, Singing and Speech

Cited by: 4
Authors
Patil, Hemant A. [1 ]
Madhavi, Maulik C. [1 ]
Chhayani, Nirav H. [1 ]
Affiliation
[1] Dhirubhai Ambani Inst Informat & Commun Technol D, Gandhinagar, India
Keywords
Biometric; Humming; Corpus development; Speaker recognition; Singer recognition
DOI
10.1109/IALP.2012.58
Chinese Library Classification number
TP18 [Theory of artificial intelligence]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Speaker recognition is the task of designing a system that recognizes a person from speech with the help of computers. In this paper, three biometric signals produced by humans, viz., speech, singing and humming, are considered for the person recognition task. A corpus was developed from 28 subjects in real-life settings. For the person recognition task, a state-of-the-art feature set, viz., Mel Frequency Cepstral Coefficients (MFCC), and a discriminatively trained polynomial classifier of 2nd-order approximation are used as the spectral feature and the classification technique, respectively. Our experimental results indicate that the person recognition system using humming outperforms the other biometric patterns (i.e., speech and singing) by 9 % in EER and 9 % in Identification Rate. We believe that this may be because person-specific characteristics are better captured in humming sounds (which are nasalized sounds) than in speech and singing.
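The classification step named in the abstract can be illustrated with a minimal sketch. It assumes each recording has already been turned into a matrix of per-frame feature vectors (MFCC extraction is not shown; synthetic vectors stand in for it), expands each frame into all monomials up to degree 2, and fits one weight vector per person by ordinary least squares. The function names `poly2_expand`, `train_models` and `identify`, and the least-squares training itself, are assumptions made for illustration; the record does not give the authors' exact training procedure.

```python
from itertools import combinations_with_replacement

import numpy as np


def poly2_expand(x):
    """Map frames of shape (n, d) to all monomials of degree <= 2."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n, d = x.shape
    cols = [np.ones(n)]                       # degree 0
    cols += [x[:, i] for i in range(d)]       # degree 1
    cols += [x[:, i] * x[:, j]                # degree 2
             for i, j in combinations_with_replacement(range(d), 2)]
    return np.stack(cols, axis=1)


def train_models(features_by_person):
    """Fit one weight vector per person (target 1 for own frames, 0 otherwise).

    features_by_person: dict mapping a label to an (n_frames, d) array.
    Least-squares fitting is an illustrative assumption.
    """
    labels = list(features_by_person)
    expanded = np.vstack([poly2_expand(features_by_person[k]) for k in labels])
    models = {}
    for k in labels:
        target = np.concatenate([
            np.full(len(features_by_person[m]), 1.0 if m == k else 0.0)
            for m in labels
        ])
        w, *_ = np.linalg.lstsq(expanded, target, rcond=None)
        models[k] = w
    return models


def identify(models, frames):
    """Score an utterance by its mean expanded vector; return best label and scores."""
    mean_p = poly2_expand(frames).mean(axis=0)
    scores = {k: float(w @ mean_p) for k, w in models.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for MFCC frames of two hypothetical subjects.
    train = {
        "subject_a": rng.normal(0.0, 1.0, size=(200, 3)),
        "subject_b": rng.normal(3.0, 1.0, size=(200, 3)),
    }
    models = train_models(train)
    label, scores = identify(models, rng.normal(3.0, 1.0, size=(50, 3)))
    print(label, scores)
```

Averaging the expanded vectors before applying the weights gives the same result as averaging the per-frame scores, because the score is linear in the expanded features, so an utterance of any length is scored with a single dot product per enrolled person.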
Pages: 149-152
Number of pages: 4