Towards Cross-Corpora Generalization for Low-Resource Spoken Language Identification

Citations: 0
Authors
Dey, Spandan [1, 2]
Sahidullah, Md [3, 4]
Saha, Goutam [1 ]
Affiliations
[1] Indian Inst Technol Kharagpur, Dept E & ECE, Kharagpur 721302, India
[2] Samsung R&D Inst India Bangalore, Bengaluru 560037, India
[3] TCG CREST, Inst Adv Intelligence, Bidhannagar 700091, India
[4] Acad Sci & Innovat Res, Ghaziabad 201002, India
Keywords
Vectors; Training; Correlation; Speech processing; Recording; Noise; Measurement; Databases; Training data; NIST; Spoken language identification; low-resource; cross-corpora evaluation; corpora mismatch; domain invariance;
DOI
10.1109/TASLP.2024.3492807
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Low-resource spoken language identification (LID) systems are prone to poor generalization across unknown domains. In this study, using multiple widely used low-resource South Asian LID corpora, we conduct an in-depth analysis to understand the key non-lingual bias factors that create corpora mismatch and degrade LID generalization. To quantify the biases, we extract different data-driven and rule-based summary vectors that capture non-lingual aspects, such as speaker characteristics, spoken context, accents or dialects, recording channels, background noise, and environments. We then conduct a statistical analysis to identify the most crucial non-lingual bias factors and corpora mismatch components that impact LID performance. Following these analyses, we propose effective bias compensation approaches for the most relevant summary vectors. We generate pseudo-labels using hierarchical clustering over language-domain-gender constrained summary vectors and use them to train adversarial networks with a conditioned metric loss. The compensations learn invariance to the corpora mismatches caused by the non-lingual biases and help improve generalization. With the proposed compensation method, we improve the equal error rate by up to 5.22% and 8.14% for the same-corpora and cross-corpora evaluations, respectively.
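The abstract reports results as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal illustrative sketch of how an EER can be approximated from detection scores; it is not taken from the paper, and the function name `equal_error_rate` and the example scores are assumptions made for illustration only.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction in [0, 1]) by sweeping thresholds.

    Hypothetical helper: target_scores are scores of trials where the
    claimed language is correct, nontarget_scores are all other trials.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # Keep the threshold where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Made-up scores, for illustration only.
example_eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {example_eer:.2%}")
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch returns their average at the closest threshold; evaluation toolkits often interpolate instead.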
Pages: 5040-5050
Page count: 11