Novel Demodulation-Based Features using Classifier-level Fusion of GMM and CNN for Replay Detection

Cited by: 0
Authors
Kamble, Madhu R. [1 ]
Tak, Hemlata [1 ]
Krishna, Maddala V. Siva [2 ]
Patil, Hemant A. [1 ]
Affiliations
[1] DA IICT, Speech Res Lab, Gandhinagar, Gujarat, India
[2] IIIT, Vadodara, Gujarat, India
Keywords
Automatic speaker verification; spoof; replay; demodulation techniques; convolutional neural network; SPEAKER VERIFICATION; INSTANTANEOUS FREQUENCY; ENERGY SEPARATION; COUNTERMEASURES;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we explore the use of Convolutional Neural Networks (CNNs) for replay spoof detection in Automatic Speaker Verification (ASV) systems. Amplitude and Frequency Modulation (AM-FM) feature sets obtained from the Hilbert transform (HT) and the Energy Separation Algorithm (ESA) are used as the front end. We observe the effect of replacing the max-pooling and fully connected (FC) layers of the CNN with convolutional layers. The results are compared with a Gaussian Mixture Model (GMM) classifier. Furthermore, to exploit possible complementary information in the GMM and CNN classifiers, we explore classifier-level fusion. In addition, we compare our results with the Constant-Q Cepstral Coefficients (CQCC) and Mel Frequency Cepstral Coefficients (MFCC) feature sets. The architecture in which max-pooling is replaced with a convolutional layer, along with FC layers, performed relatively better than the other CNNs on most of the AM-FM feature sets. The ESA-based AM features, i.e., Instantaneous Amplitude Cosine Coefficients (ESA-IACC), performed better, as the AM components fluctuate less than the FM components during model training. The lowest equal error rate (EER) is obtained with classifier-level fusion on the ESA-IACC feature set: 2.54 % on the development set and 6.04 % on the evaluation set of the ASVspoof 2017 Challenge database.
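The abstract names two building blocks: ESA-based AM-FM demodulation and classifier-level fusion. The abstract does not give implementation details, so the following is only a minimal sketch under stated assumptions. It uses the DESA-2 variant of the energy separation algorithm, built on the discrete Teager energy operator, and a weighted linear combination of the two classifier scores. The function names `teager_energy`, `desa2`, and `fuse_scores`, and the weight `alpha`, are hypothetical and not taken from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator: psi[x(n)] = x(n)^2 - x(n-1) * x(n+1)."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x):
    """Estimate instantaneous amplitude and frequency (rad/sample) with DESA-2.

    Assumption: the paper's ESA variant is not stated in the abstract;
    DESA-2 is used here purely for illustration.
    """
    # Symmetric difference y(n) = x(n+1) - x(n-1), defined for n = 1 .. len(x)-2.
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager_energy(x)  # psi_x[i] belongs to sample n = i + 1
    psi_y = teager_energy(y)  # psi_y[j] belongs to sample n = j + 2
    amplitude, frequency = [], []
    for j, py in enumerate(psi_y):
        px = psi_x[j + 1]  # align both energies to the same sample n = j + 2
        if px <= 0.0 or py <= 0.0:
            # The estimates are undefined for non-positive energies; emit zeros.
            amplitude.append(0.0)
            frequency.append(0.0)
            continue
        # Clamp to the valid acos domain to guard against rounding error.
        arg = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        frequency.append(0.5 * math.acos(arg))
        amplitude.append(2.0 * px / math.sqrt(py))
    return amplitude, frequency


def fuse_scores(gmm_score, cnn_score, alpha=0.5):
    """Hypothetical classifier-level fusion: convex combination of two scores."""
    return alpha * gmm_score + (1.0 - alpha) * cnn_score


if __name__ == "__main__":
    # A pure cosine with amplitude 2.0 and frequency 0.3 rad/sample.
    signal = [2.0 * math.cos(0.3 * n + 0.1) for n in range(50)]
    amp, freq = desa2(signal)
    print(amp[10], freq[10])
    print(fuse_scores(1.0, 3.0, alpha=0.25))
```

For a pure cosine, DESA-2 recovers the amplitude and frequency exactly up to rounding, which makes it a convenient sanity check. In practice the fusion weight would be tuned on a development set, and the two scores would usually be brought to a comparable scale before they are combined.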
Pages: 334-338
Page count: 5