Speech Replay Detection with x-Vector Attack Embeddings and Spectral Features

Cited by: 8
Authors
Williams, Jennifer [1]
Rownicka, Joanna [1]
Affiliations
[1] Univ Edinburgh, Ctr Speech Technol Res, Edinburgh, Midlothian, Scotland
Source
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
automatic speaker verification; spoofing countermeasures; speech replay detection; NOISE;
DOI
10.21437/Interspeech.2019-1760
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104 ; 100213 ;
Abstract
We present our system submission to the ASVspoof 2019 Challenge Physical Access (PA) task. The objective for this challenge was to develop a countermeasure that identifies speech audio as either bona fide or intercepted and replayed. The target prediction was a value indicating that a speech segment was bona fide (positive values) or "spoofed" (negative values). Our system used convolutional neural networks (CNNs) and a representation of the speech audio that combined x-vector attack embeddings with signal processing features. The x-vector attack embeddings were created from mel-frequency cepstral coefficients (MFCCs) using a time-delay neural network (TDNN). These embeddings jointly modeled 27 different environments and 9 types of attacks from the labeled data. We also used sub-band spectral centroid magnitude coefficients (SCMCs) as features. We included an additive Gaussian noise layer during training as a way to augment the data to make our system more robust to previously unseen attack examples. We report system performance using the tandem detection cost function (tDCF) and equal error rate (EER). Our approach performed better than both of the challenge baselines. Our results suggest that our x-vector attack embeddings can help regularize the CNN predictions even when environments or attacks are more challenging.
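As a rough illustration of the EER metric named in the abstract, the sketch below estimates it from two lists of detector scores, assuming the abstract's convention that higher (positive) scores indicate bona fide speech. The function name `equal_error_rate` and the example scores are hypothetical; this is not the challenge's official scoring code, and it does not compute the tDCF, which also depends on the errors of a speaker verification system and on cost parameters not given here.

```python
import numpy as np


def equal_error_rate(bona_fide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean bona fide.

    Hypothetical helper for illustration only; thresholds are restricted
    to the observed scores, so the result is a grid approximation.
    """
    bona = np.asarray(bona_fide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every observed score, in ascending order.
    thresholds = np.sort(np.concatenate([bona, spoof]))

    # Miss rate: fraction of bona fide trials scored below the threshold.
    miss = np.array([(bona < t).mean() for t in thresholds])
    # False-alarm rate: fraction of spoofed trials at or above the threshold.
    false_alarm = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is where the two rates meet; pick the closest grid point.
    idx = int(np.argmin(np.abs(miss - false_alarm)))
    return (miss[idx] + false_alarm[idx]) / 2.0, thresholds[idx]


# Made-up scores: positive leans bona fide, negative leans spoofed.
eer, threshold = equal_error_rate([2.0, 1.5, 0.5, -0.2], [-1.0, -0.5, 0.1, 0.7])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

A lower EER means the bona fide and spoofed score distributions overlap less; a detector that separates the two classes perfectly has an EER of zero.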
Pages: 1053-1057
Number of pages: 5