Replay Speech Detection Based on Dual-Input Hierarchical Fusion Network

Authors
Hu, Chenlei [1]
Zhou, Ruohua [1]
Yuan, Qingsheng [2]
Affiliations
[1] Beijing Univ Civil Engn & Architecture, Sch Elect & Informat Engn, Beijing 102627, Peoples R China
[2] Natl Comp Network Emergency Response Tech Team Coo, Beijing 100029, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 09
Keywords
anti-spoofing; replay speech detection; HFM; ASVspoof 2021; PATTERN
DOI
10.3390/app13095350
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Speech anti-spoofing is a crucial component of speaker recognition systems and has received considerable attention in recent years. Deep neural networks achieve satisfactory results on datasets whose training and test data follow similar distributions, but they generalize poorly when the distributions differ. In this paper, we propose a novel dual-input hierarchical fusion network (HFN) to improve generalization. The network takes two inputs, the original speech signal and its time-reversed version, which increases the volume and diversity of the training data. The hierarchical fusion model (HFM) fuses the two inputs after speech feature extraction, enabling more thorough fusion of information from different input levels and improving model performance. We evaluate the system on the ASVspoof 2021 PA (Physical Access) dataset, where it achieves an Equal Error Rate (EER) of 24.46% and a minimum tandem Detection Cost Function (min t-DCF) of 0.6708 on the test set. Compared with the four baseline systems of the ASVspoof 2021 challenge, the proposed system reduces min t-DCF by 28.9%, 31.0%, 32.6%, and 32.9%, and EER by 35.7%, 38.1%, 45.4%, and 49.7%, respectively.
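The dual-input idea described in the abstract can be illustrated with a minimal sketch. It only shows how an original waveform and its time-reversed copy might be paired; the function name `make_dual_input` and the toy sample values are hypothetical, and the feature extraction and hierarchical fusion stages are not shown because the abstract does not specify them.

```python
from typing import List, Sequence, Tuple


def make_dual_input(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (original, time-reversed) copies of a 1-D waveform.

    Hypothetical helper: it illustrates the two inputs named in the
    abstract, not the authors' actual preprocessing code.
    """
    original = list(signal)
    time_reversed = original[::-1]  # last sample becomes the first
    return original, time_reversed


# Toy waveform standing in for real speech samples (made-up values).
waveform = [0.0, 0.5, 1.0, -0.25]
original, time_reversed = make_dual_input(waveform)
print(original)       # [0.0, 0.5, 1.0, -0.25]
print(time_reversed)  # [-0.25, 1.0, 0.5, 0.0]
```

In a real pipeline each of the two signals would presumably pass through the same feature extractor before fusion, but that detail is an assumption here.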
Pages: 10