Experimental Case Study of Self-Supervised Learning for Voice Spoofing Detection

Cited by: 1
Authors
Lee, Yerin [1]
Kim, Narin [1]
Jeong, Jaehong [2, 3]
Kwak, Il-Youp [1]
Affiliations
[1] Chung Ang Univ, Dept Appl Stat, Seoul 06974, South Korea
[2] Hanyang Univ, Dept Math, Seoul 04763, South Korea
[3] Hanyang Univ, Res Inst Nat Sci, Seoul 04763, South Korea
Source
IEEE ACCESS | 2023 | Vol. 11
Funding
National Research Foundation of Singapore;
Keywords
Self-supervised learning; Task analysis; Supervised learning; Speech processing; Deep learning; Training; Microphones; Spoofing detection; self-supervised learning; contrastive learning;
DOI
10.1109/ACCESS.2023.3254880
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This study aims to improve the performance of voice spoofing attack detection through self-supervised pre-training. Supervised learning needs appropriate input variables and corresponding labels for constructing the machine learning models that are to be applied. It is necessary to secure a large number of labeled datasets to improve the performance of supervised learning processes. However, labeling requires substantial inputs of time and effort. One of the methods for managing this requirement is self-supervised learning, which uses pseudo-labeling without the necessity for substantial human input. This study experimented with contrastive learning, a well-performing self-supervised learning approach, to construct a voice spoofing detection model. We applied MoCo's dynamic dictionary, SimCLR's symmetric loss, and COLA's bilinear similarity in our contrastive learning framework. Our model was trained using VoxCeleb data and voice data extracted from YouTube videos. Our self-supervised model improved the performance of the baseline model from 6.93% to 5.26% for a logical access (LA) scenario and improved the performance of the baseline model from 0.60% to 0.40% for a physical access (PA) scenario. In the case of PA, the best performance was achieved when random crop augmentation was applied, and in the case of LA, the best performance was obtained when random crop and random shifting augmentations were considered.
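The abstract names three ingredients of the contrastive framework (a bilinear similarity, a symmetric loss, and crop/shift augmentations) without giving formulas. The following is a minimal NumPy sketch of how such pieces are commonly written, not the authors' implementation. The function names, the circular form of the shift, and the tensor shapes are assumptions made for illustration; MoCo's dynamic dictionary (a queue of past keys) is omitted.

```python
import numpy as np

def random_crop(wave, crop_len, rng):
    """Return a random contiguous segment of length crop_len."""
    start = rng.integers(0, len(wave) - crop_len + 1)
    return wave[start:start + crop_len]

def random_shift(wave, max_shift, rng):
    """Circularly shift by a random offset in [-max_shift, max_shift].
    Assumption: the paper's shifting augmentation may be defined differently."""
    return np.roll(wave, rng.integers(-max_shift, max_shift + 1))

def log_softmax(logits):
    """Row-wise log-softmax, stabilised by subtracting the row maximum."""
    z = logits - logits.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def symmetric_bilinear_loss(za, zb, W):
    """Symmetric contrastive loss with bilinear similarity.

    za, zb: (N, D) embeddings of two augmented views of the same N clips.
    W:      (D, D) bilinear weight matrix (learned in practice).
    """
    sim = za @ W @ zb.T                 # sim[i, j] = za[i]^T W zb[j]
    idx = np.arange(len(za))            # positives sit on the diagonal
    loss_ab = -log_softmax(sim)[idx, idx].mean()    # view a -> view b
    loss_ba = -log_softmax(sim.T)[idx, idx].mean()  # view b -> view a
    return 0.5 * (loss_ab + loss_ba)
```

In a training loop, two augmented views of each clip would be encoded to `za` and `zb`, and the loss would be minimised with respect to both the encoder and `W`.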
Pages: 24216-24226
Page count: 11