AN EFFICIENT TRANSFORMER-BASED MODEL FOR VOICE ACTIVITY DETECTION

Cited by: 1
Authors
Zhao, Yifei [1]
Champagne, Benoit [1]
Affiliations
[1] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Voice activity detection; transformer-based architecture; audio fingerprinting; noise
DOI
10.1109/MLSP55214.2022.9943501
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Voice Activity Detection (VAD) aims to distinguish, at a given time, between desired speech and non-speech. Although many state-of-the-art approaches for increasing the performance of VAD have been proposed, they are still not robust enough to be applied under adverse noise conditions with low Signal-to-Noise Ratio (SNR). To deal with this issue, we propose a novel transformer-based architecture for VAD with reduced computational complexity by implementing efficient depth-wise convolutions on feature patches. The proposed model, named Tr-VAD, demonstrates better performance compared to baseline methods from the literature in a variety of scenarios considered with the smallest possible number of parameters. The results also indicate that the use of a combination of Audio Fingerprinting (AFP) features with Tr-VAD can guarantee better performance.
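The abstract attributes the reduced complexity to depth-wise convolutions. The following is a minimal sketch of why that operation is cheap in general; it is not the Tr-VAD architecture. The channel counts, kernel size, and function names are hypothetical, and the pairing with a 1x1 point-wise convolution is the common depth-wise separable pattern, assumed here for comparison only.

```python
def standard_conv_params(c_in, c_out, kernel_size):
    """Weight count of a standard 1-D convolution (bias omitted)."""
    return c_in * c_out * kernel_size


def depthwise_separable_params(c_in, c_out, kernel_size):
    """Weight count of a depth-wise conv plus a 1x1 point-wise conv (bias omitted)."""
    depthwise = c_in * kernel_size   # one kernel per input channel
    pointwise = c_in * c_out         # 1x1 conv that mixes channels
    return depthwise + pointwise


def depthwise_conv1d(x, kernels):
    """Depth-wise 1-D cross-correlation, stride 1, no padding.

    x:       list of channels, each a list of numbers
    kernels: one kernel (list of numbers) per channel
    Each channel is filtered only by its own kernel; channels are never mixed.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        out.append([
            sum(channel[i + j] * kernel[j] for j in range(k))
            for i in range(len(channel) - k + 1)
        ])
    return out


# Hypothetical layer sizes, chosen only for illustration.
C_IN, C_OUT, K = 64, 64, 3
print(standard_conv_params(C_IN, C_OUT, K))        # 12288
print(depthwise_separable_params(C_IN, C_OUT, K))  # 4288

# Two channels, each filtered by its own length-2 kernel.
print(depthwise_conv1d([[1, 2, 3, 4], [0, 1, 0, 1]], [[1, 1], [1, -1]]))
# [[3, 5, 7], [-1, 1, -1]]
```

With these illustrative sizes the depth-wise separable layer needs roughly a third of the weights of the standard layer, which is the kind of saving the abstract refers to.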
Pages: 6