Discriminative features based on modified log magnitude spectrum for playback speech detection

Authors
Jichen Yang
Longting Xu
Bo Ren
Yunyun Ji
Affiliations
[1] Department of Electrical and Computer Engineering, National University of Singapore
[2] College of Information Science and Technology, Donghua University
[3] Microsoft Search Technology Center Asia
[4] Electronics and Information School, Nantong University
Keywords
Discriminative feature; Playback attack detection; Modified log magnitude spectrum; Constant-Q variance-based octave coefficients; Constant-Q mean-based octave coefficients
DOI
Not available
Abstract
To improve the performance of hand-crafted features for playback speech detection, this work proposes two discriminative features: constant-Q variance-based octave coefficients and constant-Q mean-based octave coefficients. Both rest on our finding that the variance-based and mean-based modified log magnitude spectra enhance the discriminative power between genuine speech and playback speech. Constant-Q variance-based octave coefficients are obtained by combining the variance-based modified log magnitude spectrum with octave segmentation and the discrete cosine transform; constant-Q mean-based octave coefficients are obtained in the same way from the mean-based modified log magnitude spectrum. Both features are evaluated on the ASVspoof 2017 corpus version 2.0 and on ASVspoof 2019 physical access. Experimental results show that the variance-based and mean-based modified log magnitude spectra produce discriminative features for playback speech. Further results on the two databases show that both proposed features outperform some common features, such as mel frequency cepstral coefficients and constant-Q cepstral coefficients.
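The pipeline named in the abstract (modified log magnitude spectrum, octave segmentation, discrete cosine transform) can be illustrated with a minimal sketch. The abstract does not define the modifications or the segmentation, so everything specific here is an assumption made for illustration: "mean-based" is taken to mean subtracting each bin's mean over time, "variance-based" is taken to mean dividing each bin by its standard deviation over time, and the DCT is applied separately to each octave. The function names modify, octave_coefficients, and dct2 are hypothetical, and the input is assumed to be a precomputed constant-Q log magnitude spectrum.

```python
import math

def dct2(x):
    """Unnormalised DCT-II of a sequence of floats."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def modify(frames, mode):
    """Modify a log magnitude spectrum given as a list of frames.

    Each frame is a list with one log magnitude per constant-Q bin.
    ASSUMPTION: 'mean' subtracts the per-bin mean over time and
    'variance' divides by the per-bin standard deviation over time;
    the abstract does not give the authors' actual definitions.
    """
    n_frames, n_bins = len(frames), len(frames[0])
    means = [sum(f[b] for f in frames) / n_frames for b in range(n_bins)]
    if mode == "mean":
        return [[f[b] - means[b] for b in range(n_bins)] for f in frames]
    if mode == "variance":
        stds = [math.sqrt(sum((f[b] - means[b]) ** 2 for f in frames) / n_frames)
                for b in range(n_bins)]
        eps = 1e-8  # guards against division by zero for constant bins
        return [[f[b] / (stds[b] + eps) for b in range(n_bins)] for f in frames]
    raise ValueError("mode must be 'mean' or 'variance'")

def octave_coefficients(frame, bins_per_octave, n_coeffs):
    """Split one frame into octaves and keep the first DCT coefficients of each.

    ASSUMPTION: the DCT is applied per octave and the results are concatenated.
    """
    feats = []
    for start in range(0, len(frame), bins_per_octave):
        octave = frame[start:start + bins_per_octave]
        feats.extend(dct2(octave)[:n_coeffs])
    return feats

# Toy input: 2 frames, 4 bins, 2 bins per octave (so 2 octaves).
frames = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
mean_mod = modify(frames, "mean")
feats = [octave_coefficients(f, bins_per_octave=2, n_coeffs=1) for f in mean_mod]
```

With real audio, the frames would come from a constant-Q transform, and a library DCT would normally replace the hand-written dct2; the pure-Python version is used only to keep the sketch self-contained.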
Related papers
50 in total
  • [1] Discriminative features based on modified log magnitude spectrum for playback speech detection
    Yang, Jichen
    Xu, Longting
    Ren, Bo
    Ji, Yunyun
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2020, 2020 (01)
  • [2] Playback speech detection based on magnitude-phase spectrum
    Yang, Jichen
    Liu, Leian
    ELECTRONICS LETTERS, 2018, 54 (14) : 901 - 902
  • [3] Discriminative feature based on FWMW for playback speech detection
    Yang, Jichen
    Liu, Leian
    He, Qianhua
    ELECTRONICS LETTERS, 2019, 55 (15) : 861 - 863
  • [5] Enhancing the magnitude spectrum of speech features for robust speech recognition
    Hung, Jeih-weih
    Fan, Hao-teng
    Tu, Wen-hsiang
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2012
  • [6] Postfiltering Using Log-Magnitude Spectrum for Speech and Audio Coding
    Das, Sneha
    Backstrom, Tom
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018 : 3543 - 3547
  • [7] Simultaneous Speech Detection and Magnitude Squared Spectrum Estimation Approach for Speech Enhancement
    Han, Ruirui
    Ou, Shifeng
    Liu, Wei
    Chen, Chen
    Zhang, Shuo
    PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018 : 281 - 285
  • [8] Detection of stress and emotion in speech using traditional and FFT based log energy features
    Nwe, TL
    Foo, SW
    De Silva, LC
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003 : 1619 - 1623
  • [9] Speech Enhancement Based on Noise Compensated Magnitude Spectrum
    Islam, Md. T.
    Hussain, A. B.
    Shahid, K. T.
    Saha, U.
    Shahnaz, C.
    2014 INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS & VISION (ICIEV), 2014
  • [10] Discriminative auditory-based features for robust speech recognition
    Mak, BKW
    Tam, YC
    Li, PQ
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12 (01) : 27 - 36