DOMAIN ATTENTIVE FUSION FOR END-TO-END DIALECT IDENTIFICATION WITH UNKNOWN TARGET DOMAIN

Authors
Shon, Suwon [1]
Ali, Ahmed [2]
Glass, James [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] HBKU, Qatar Comp Res Inst, Doha, Qatar
Keywords
Dialect identification; language identification; self-attention; fusion
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
End-to-end deep learning language or dialect identification systems operate on spectrograms or other acoustic features and directly generate identification scores for each class. An important issue for end-to-end systems is to have some knowledge of the application domain, because the system can be vulnerable to use cases that were not seen in the training phase; such a scenario is often referred to as a domain mismatched condition. In general, we assume that there is enough variation in the training dataset to expose the system to multiple domains. In this work, we study how best to make use of a training dataset to achieve maximum effectiveness on unknown target domains. Our goal is to process the input without any knowledge of the target domain while preserving robust performance on other domains as well. To accomplish this objective, we propose a domain attentive fusion approach for end-to-end dialect/language identification systems. To help with experimentation, we collect a dataset from three different domains and create experimental protocols for a domain mismatched condition. The results of our proposed approach, which was tested on a variety of broadcast and YouTube data, show significant performance gains compared to traditional approaches, even without any prior target domain information.
Pages: 5951-5955
Page count: 5