COMBINING REGRESSION AND CLASSIFICATION METHODS FOR IMPROVING AUTOMATIC SPEAKER AGE RECOGNITION

Cited by: 21
Authors
van Heerden, Charl [1]
Barnard, Etienne [1]
Davel, Marelie [1]
van der Walt, Christiaan [1]
van Dyk, Ewald [1, 2]
Feld, Michael [2]
Mueller, Christian [2]
Affiliations
[1] CSIR, Human Language Technol Meraka Inst, Pretoria, South Africa
[2] German Res Ctr AI, Intelligent User Interface, Berlin, Germany
Keywords
Age classification; gender classification; support vector machines
DOI
10.1109/ICASSP.2010.5495006
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
We present a novel approach to automatic speaker age classification, which combines regression and classification to achieve competitive classification accuracy on telephone speech. Support vector machine regression is used to generate finer age estimates, which are combined with the posterior probabilities of well-trained discriminative gender classifiers to predict both the age and gender of a speaker. We show that this combination performs better than direct 7-class classifiers. The regressors and classifiers are trained using long-term features such as pitch and formants, as well as short-term (frame-based) features derived from MAP adaptation of GMMs that were trained on MFCCs.
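The combination described in the abstract can be illustrated with a minimal sketch: a continuous age estimate (as a regressor might produce) is binned into an age group, and a gender posterior then selects the gender-specific class. Everything specific here is an assumption for illustration only and is not taken from the paper: the age boundaries, the 0.5 gender threshold, the class names, the layout of the 7 classes (one child class plus three age groups split by gender), and the function names `age_group` and `combine`. The authors' actual combination rule may well differ.

```python
# Hypothetical age boundaries in years; the abstract does not state them.
CHILD_MAX_AGE = 14
YOUNG_MAX_AGE = 24
ADULT_MAX_AGE = 54


def age_group(age_estimate: float) -> str:
    """Bin a continuous age estimate into a coarse age group."""
    if age_estimate <= CHILD_MAX_AGE:
        return "child"
    if age_estimate <= YOUNG_MAX_AGE:
        return "young"
    if age_estimate <= ADULT_MAX_AGE:
        return "adult"
    return "senior"


def combine(age_estimate: float, p_female: float) -> str:
    """Map an age estimate and a gender posterior to one of 7 classes.

    Assumed class layout: children form a single class; the other
    three age groups are each split into female and male.
    """
    if not 0.0 <= p_female <= 1.0:
        raise ValueError("p_female must be a probability in [0, 1]")
    group = age_group(age_estimate)
    if group == "child":
        return "child"
    # Assumed decision rule: pick the gender with the larger posterior.
    gender = "female" if p_female >= 0.5 else "male"
    return f"{group} {gender}"


print(combine(31.2, 0.91))  # adult female
print(combine(9.5, 0.40))   # child
print(combine(67.0, 0.12))  # senior male
```

In this sketch the regressor and the gender classifier are treated as black boxes that supply `age_estimate` and `p_female`; in the paper these would come from the SVM regressor and the discriminative gender classifiers, respectively.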
Pages: 5174 - 5177
Number of pages: 4