Novel Demodulation-Based Features using Classifier-level Fusion of GMM and CNN for Replay Detection

Cited by: 0
Authors
Kamble, Madhu R. [1]
Tak, Hemlata [1]
Krishna, Maddala V. Siva [2]
Patil, Hemant A. [1]
Affiliations
[1] DA IICT, Speech Res Lab, Gandhinagar, Gujarat, India
[2] IIIT, Vadodara, Gujarat, India
Keywords
Automatic speaker verification; spoof; replay; demodulation techniques; convolutional neural network; SPEAKER VERIFICATION; INSTANTANEOUS FREQUENCY; ENERGY SEPARATION; COUNTERMEASURES;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we explore the use of Convolutional Neural Networks (CNNs) for replay spoof detection in Automatic Speaker Verification (ASV) systems. Amplitude and Frequency Modulation (AM-FM) feature sets obtained from the Hilbert transform (HT) and the Energy Separation Algorithm (ESA) are used as the front end. We observe the effect of replacing the max-pooling and fully connected (FC) layers of the CNN with convolutional layers. The results are compared with a Gaussian Mixture Model (GMM) classifier; furthermore, to exploit possible complementary information from the GMM and CNN classifiers, we explore classifier-level fusion. In addition, we compare our results with the Constant-Q Cepstral Coefficients (CQCC) and Mel Frequency Cepstral Coefficients (MFCC) feature sets. The architecture in which max-pooling is replaced with a convolutional layer and the FC layers are retained performed relatively better than the other CNNs on most of the AM-FM feature sets. The ESA-based AM features (i.e., Instantaneous Amplitude Cosine Coefficients (ESA-IACC)) performed better, as the AM features fluctuate less than the FM features during model training. A lower EER is obtained with classifier-level fusion of the ESA-IACC feature set, resulting in 2.54 % EER on the development set and 6.04 % EER on the evaluation set of the ASVspoof 2017 Challenge database.
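To make the ESA front end mentioned in the abstract more concrete, the following is a minimal sketch of energy separation on a single narrowband signal. It assumes the DESA-2 variant of the algorithm, built on the Teager-Kaiser energy operator; the abstract does not state which ESA variant the authors use, and the full ESA-IACC pipeline (subband filtering, cosine coefficients, and so on) is not reproduced here. The function names `teager_energy` and `desa2` and the test tone are hypothetical, chosen only for illustration.

```python
import math


def teager_energy(x):
    """Teager-Kaiser energy Psi[x](n) = x(n)^2 - x(n-1) * x(n+1), interior samples only."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x, eps=1e-12):
    """Estimate instantaneous amplitude and frequency (rad/sample) with DESA-2.

    Assumption: DESA-2 is used here for illustration; the source abstract
    only says "Energy Separation Algorithm".
    Returns two lists of length len(x) - 4 (two samples are lost at each edge).
    """
    # Symmetric difference y(n) = x(n+1) - x(n-1)
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager_energy(x)  # psi_x[i] belongs to sample n = i + 1
    psi_y = teager_energy(y)  # psi_y[k] belongs to sample n = k + 2
    amp, omega = [], []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # align both energies on the same sample
        if px <= eps or py <= eps:
            # Degenerate frame (e.g. silence): report zeros instead of dividing
            amp.append(0.0)
            omega.append(0.0)
            continue
        c = 1.0 - py / (2.0 * px)
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        omega.append(0.5 * math.acos(c))
        amp.append(2.0 * px / math.sqrt(py))
    return amp, omega


# Hypothetical test tone: 440 Hz cosine, amplitude 0.5, sampled at 16 kHz
fs = 16000
tone = [0.5 * math.cos(2 * math.pi * 440 * n / fs + 0.3) for n in range(200)]
amp, omega = desa2(tone)
freq_hz = [w * fs / (2 * math.pi) for w in omega]
```

For a pure cosine the DESA-2 estimates are exact up to rounding, so `amp` stays close to 0.5 and `freq_hz` close to 440. On real speech the operator would normally be applied to each subband of a filterbank, since energy separation assumes a narrowband AM-FM signal.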
Pages: 334 - 338
Number of pages: 5
Related papers
50 in total
  • [1] Effectiveness of Speech Demodulation-Based Features for Replay Detection
    Kamble, Madhu R.
    Tak, Hemlata
    Patil, Hemant A.
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 641 - 645
  • [2] Speech Demodulation-based Techniques for Replay and Presentation Attack Detection
    Kamble, Madhu R.
    Sai, Pulikonda Aditya Krishna
    Krishna, Maddala V. Siva
    Patil, Ankur T.
    Acharya, Rajul
    Patil, Hemant A.
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1545 - 1550
  • [3] Detection algorithm for pigmented skin disease based on classifier-level and feature-level fusion
    Wan, Li
    Ai, Zhuang
    Chen, Jinbo
    Jiang, Qian
    Chen, Hongying
    Li, Qi
    Lu, Yaping
    Chen, Liuqing
    FRONTIERS IN PUBLIC HEALTH, 2022, 10
  • [4] Speech emotion classification using feature-level and classifier-level fusion
    Mishra, Siba Prasad
    Warule, Pankaj
    Deb, Suman
    EVOLVING SYSTEMS, 2024, 15 (02) : 541 - 554
  • [5] Speech emotion classification using feature-level and classifier-level fusion
    Siba Prasad Mishra
    Pankaj Warule
    Suman Deb
    Evolving Systems, 2024, 15 : 541 - 554
  • [6] Detection algorithm for pigmented skin disease based on classifier-level and feature-level fusion (vol 10, 1034772, 2022)
    Wan, Li
    Ai, Zhuang
    Chen, Jinbo
    Jiang, Qian
    Chen, Hongying
    Li, Qi
    Lu, Yaping
    Chen, Liuqing
    FRONTIERS IN PUBLIC HEALTH, 2023, 11
  • [7] Face recognition system based on CNN and LBP features for classifier optimization and fusion
    Wu Yulin
    Jiang Mingyan
    The Journal of China Universities of Posts and Telecommunications, 2018, 25 (01) : 37 - 47
  • [8] An automated detection of epileptic seizures EEG using CNN classifier based on feature fusion with high accuracy
    Chen, Wenna
    Wang, Yixing
    Ren, Yuhao
    Jiang, Hongwei
    Du, Ganqin
    Zhang, Jincan
    Li, Jinghua
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2023, 23 (01)
  • [9] An automated detection of epileptic seizures EEG using CNN classifier based on feature fusion with high accuracy
    Wenna Chen
    Yixing Wang
    Yuhao Ren
    Hongwei Jiang
    Ganqin Du
    Jincan Zhang
    Jinghua Li
    BMC Medical Informatics and Decision Making, 23
  • [10] Replay Spoof Detection using Power Function Based Features
    Tapkir, Prasad A.
    Kamble, Madhu R.
    Patil, Hemant A.
    Madhavi, Maulik
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 1019 - 1023