AN EFFICIENT TRANSFORMER-BASED MODEL FOR VOICE ACTIVITY DETECTION

Cited by: 1
Authors
Zhao, Yifei [1]
Champagne, Benoit [1]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Voice activity detection; transformer-based architecture; audio fingerprinting; NOISE;
DOI
10.1109/MLSP55214.2022.9943501
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Voice Activity Detection (VAD) aims to distinguish, at a given time, between desired speech and non-speech. Although many state-of-the-art approaches for improving VAD performance have been proposed, they are still not robust enough for adverse noise conditions with low Signal-to-Noise Ratio (SNR). To address this issue, we propose a novel transformer-based architecture for VAD that reduces computational complexity by applying efficient depth-wise convolutions to feature patches. The proposed model, named Tr-VAD, outperforms baseline methods from the literature across the scenarios considered while using the smallest number of parameters. The results also indicate that combining Audio Fingerprinting (AFP) features with Tr-VAD further improves performance.
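As a rough illustration of why depth-wise convolutions reduce model size, the following minimal sketch compares the parameter count of a standard 1-D convolution with that of a depth-wise one. The channel count and kernel size are hypothetical values chosen for illustration, and the 1-D setting is an assumption; none of these figures are taken from the Tr-VAD paper.

```python
def standard_conv1d_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Parameter count of a standard 1-D convolution.

    Every output channel has one kernel per input channel.
    """
    return c_in * c_out * kernel + (c_out if bias else 0)


def depthwise_conv1d_params(channels: int, kernel: int, bias: bool = True) -> int:
    """Parameter count of a depth-wise 1-D convolution.

    Each channel is filtered independently by its own single kernel,
    so the count grows linearly (not quadratically) with the channel count.
    """
    return channels * kernel + (channels if bias else 0)


# Hypothetical layer size (assumption, not from the paper).
CHANNELS = 64
KERNEL = 3

standard = standard_conv1d_params(CHANNELS, CHANNELS, KERNEL)
depthwise = depthwise_conv1d_params(CHANNELS, KERNEL)

print(f"standard conv:   {standard} parameters")
print(f"depth-wise conv: {depthwise} parameters")
print(f"reduction:       {standard / depthwise:.2f}x")
```

For equal input and output channel counts, the saving grows roughly in proportion to the number of channels, which is the general reason depth-wise layers are attractive in compact models.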
Pages: 6