SPEAKER GENDER IDENTIFICATION IN MATCHED AND MISMATCHED CONDITIONS BASED ON STACKING ENSEMBLE METHOD

Cited by: 0
Authors
Badr, Ameer A. [1, 2]
Abdul-Hassan, Alia K. [2 ]
Affiliations
[1] Imam Jaafar Al Sadiq Univ, Coll Managerial & Financial Sci, Salahaddin, Iraq
[2] Univ Technol Baghdad, Dept Comp Sci, Baghdad, Iraq
Keywords
Cross-language; Fractal dimensions; Features fusion; LDA; Speaker gender detection; LOGISTIC-REGRESSION; NAIVE BAYES; RECOGNITION; CLASSIFICATION; SPEECH; AGE
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
Identifying a speaker's gender from the voice is considered a challenging task, and it matters because it acts as a pre-processing step for enhancing speech analysis applications. This work proposes an automatic, text-independent system that identifies the speaker's gender in matched and mismatched conditions. First, three groups of features are extracted from each utterance using Fundamental Frequency (F0), Fractal Dimensions, and Mel Frequency Cepstral Coefficient (MFCC) methods. Then, the dimensionality of the extracted features is reduced using Linear Discriminant Analysis (LDA). Finally, the speaker's gender is identified by the proposed stacking ensemble classifier, in which Logistic Regression (LR), K-Nearest Neighbours (KNN), and Gaussian Naive Bayes (GNB) serve as base classifiers and a Support Vector Machine (SVM) serves as the meta-classifier. Four experiments are conducted on two datasets: TIMIT and Common-Voice. In matched conditions (i.e., same language), the proposed system achieves an accuracy of 99.74% on TIMIT and 87.28% on Common-Voice. In mismatched conditions (i.e., cross-language), the proposed system generalizes well, benefiting from the LDA method: its accuracy is 81.19% for (TIMIT\Common-Voice) and 97.78% for (Common-Voice\TIMIT). The results also show that the proposed system clearly outperforms related works that used the TIMIT dataset.
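The classification stage described in the abstract (LDA reduction followed by a stacking ensemble with LR, KNN, and GNB as base classifiers and an SVM as meta-classifier) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the synthetic feature matrix stands in for the F0, fractal-dimension, and MFCC features, and the standard scaler, the hyperparameters, the fold count, and the helper name `build_gender_classifier` are all assumptions made here.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_gender_classifier():
    """Hypothetical helper: scaling -> LDA -> stacking ensemble."""
    # Base classifiers named in the abstract; hyperparameters are assumed.
    base_classifiers = [
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("gnb", GaussianNB()),
    ]
    # The SVM meta-classifier is trained on cross-validated base predictions.
    stack = StackingClassifier(
        estimators=base_classifiers,
        final_estimator=SVC(),
        cv=5,
    )
    # With two classes, LDA can project onto at most one component.
    return make_pipeline(
        StandardScaler(),  # assumption: the abstract does not mention scaling
        LinearDiscriminantAnalysis(n_components=1),
        stack,
    )


# Synthetic stand-in for per-utterance feature vectors (not real speech data).
X, y = make_classification(
    n_samples=400,
    n_features=40,
    n_informative=10,
    n_clusters_per_class=1,
    class_sep=2.0,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_gender_classifier()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

To imitate a mismatched-condition experiment under the same assumptions, the pipeline would be fitted on features from one corpus and scored on features from the other, rather than on a split of a single dataset.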
Pages: 1119-1134
Page count: 16
Related papers
  • [1] Speaker Identification and Verification from Audio Coded Speech in Matched and Mismatched Conditions
    Jiang, Tao
    Gao, Boyang
    Han, Jiqing
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO 2009), VOLS 1-4, 2009, : 2199 - 2204
  • [2] IMPROVING THE PERFORMANCE OF VTLN UNDER MISMATCHED SPEAKER CONDITIONS AND MAKING IT APPROACH THAT OF MATCHED SPEAKER CONDITIONS
    Sanand, D. R.
    Rath, S. P.
    Umesh, S.
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 4397 - 4400
  • [3] HILBERT ENVELOPE BASED FEATURES FOR ROBUST SPEAKER IDENTIFICATION UNDER REVERBERANT MISMATCHED CONDITIONS
    Sadjadi, Seyed Omid
    Hansen, John H. L.
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5448 - 5451
  • [4] Robust Far-Field Speaker Identification under Mismatched Conditions
    Jin, Qin
    Schultz, Tanja
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1893 - 1896
  • [5] An Auditory-Based Feature Extraction Algorithm for Robust Speaker Identification Under Mismatched Conditions
    Li, Qi
    Huang, Yan
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (06): : 1791 - 1801
  • [6] Stacking Ensemble Learning-based Gender Identification for User Profiling in Smart Education
    Fu, Qiang
    Wen, Yiping
    Tan, Zheng
    Fu, Qi
    [J]. IEEE TALE2021: IEEE INTERNATIONAL CONFERENCE ON ENGINEERING, TECHNOLOGY AND EDUCATION, 2021, : 986 - 991
  • [7] Gender Identification Of The Speaker Using DTW Method
    Yucesoy, Ergun
    Nabiyev, Vasif V.
    [J]. 2009 IEEE 17TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2009, : 519 - +
  • [8] Speaker-Specific Utterance Ensemble based Transfer Attack on Speaker Identification
    Zuo, Chu-Xiao
    Leng, Jia-Yi
    Li, Wu-Jun
    [J]. INTERSPEECH 2022, 2022, : 3203 - 3207
  • [9] Intelligent Identification Method for Drilling Conditions Based on Stacking Model Fusion
    Gao, Yonghai
    Yu, Xin
    Su, Yufa
    Yin, Zhiming
    Wang, Xuerui
    Li, Shaoqiang
    [J]. ENERGIES, 2023, 16 (02)
  • [10] An Analysis of the Influence of Acoustical Adverse Conditions on Speaker Gender Identification
    Maka, Tomasz
    Dziurzanski, Piotr
    [J]. 2014 XXII ANNUAL PACIFIC VOICE CONFERENCE (PVC), 2014,