DOMAIN ATTENTIVE FUSION FOR END-TO-END DIALECT IDENTIFICATION WITH UNKNOWN TARGET DOMAIN

Authors
Shon, Suwon [1]
Ali, Ahmed [2]
Glass, James [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] HBKU, Qatar Comp Res Inst, Doha, Qatar
Keywords
Dialect identification; language identification; self-attention; fusion
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
End-to-end deep learning language or dialect identification systems operate on the spectrogram or other acoustic features and directly generate identification scores for each class. An important issue for end-to-end systems is to have some knowledge of the application domain, because the system can be vulnerable to use cases that were not seen in the training phase; such a scenario is often referred to as a domain-mismatched condition. In general, we assume that there is enough variation in the training dataset to expose the system to multiple domains. In this work, we study how best to make use of a training dataset in order to have maximum effectiveness on unknown target domains. Our goal is to process the input without any knowledge of the target domain while preserving robust performance on other domains as well. To accomplish this objective, we propose a domain attentive fusion approach for end-to-end dialect/language identification systems. To help with experimentation, we collect a dataset from three different domains and create experimental protocols for a domain-mismatched condition. The results of our proposed approach, tested on a variety of broadcast and YouTube data, show a significant performance gain compared with traditional approaches, even without any prior target domain information.
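The abstract does not spell out how the fusion is computed. As a rough illustration only, the sketch below assumes one plausible reading: several domain-specific systems each produce a score vector over the dialect classes, and a set of attention logits (one per domain system, assumed here to be supplied by some upstream network) is normalized with a softmax and used to weight those score vectors. The function names `softmax` and `attentive_fusion`, the input shapes, and the idea of per-domain systems are all assumptions for illustration, not the authors' implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fusion(domain_scores, attention_logits):
    """Fuse per-domain class scores with softmax attention weights.

    domain_scores: list of score vectors, one per (hypothetical)
        domain-specific system, each of length num_classes.
    attention_logits: one relevance logit per domain system; assumed
        to come from an upstream attention network not shown here.
    Returns (fused_scores, weights).
    """
    if len(domain_scores) != len(attention_logits):
        raise ValueError("need exactly one attention logit per domain system")
    weights = softmax(attention_logits)
    num_classes = len(domain_scores[0])
    # Weighted sum of the per-domain scores, computed class by class.
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, domain_scores))
        for c in range(num_classes)
    ]
    return fused, weights


if __name__ == "__main__":
    # Two hypothetical domain systems scoring two dialect classes.
    scores = [[1.0, 0.0], [0.0, 1.0]]
    fused, weights = attentive_fusion(scores, [0.0, 0.0])
    print(fused, weights)
```

With equal logits the weights are uniform and the fused scores reduce to a plain average; a larger logit for one domain system shifts the fused result toward that system's scores, which is the behaviour an attention-based fusion would rely on when the target domain is not known in advance.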
Pages: 5951-5955
Page count: 5