Enhancing Language Identification in Indian Context Through Exploiting Learned Features with Wav2Vec2.0

Cited by: 2
Authors
Gupta, Shivang [1]
Motepalli, Kowshik Siva Sai [1]
Kumar, Ravi [1]
Narasinga, Vamsi [1]
Mirishkar, Sai Ganesh [1]
Vuppala, Anil Kumar [1]
Affiliation
[1] International Institute of Information Technology Hyderabad, Hyderabad, India
Keywords
Language identification; Wav2vec2.0; Self-attention mechanism; Equal error rate
DOI
10.1007/978-3-031-48312-7_40
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This work proposes using a self-supervised pre-trained network to build a Language Identification (LID) system for low-resource Indian languages. The framework employed is Wav2vec2.0-XLSR-53, pre-trained on 53k hours of unlabeled speech data. This unsupervised pre-training enables the model to learn language-specific acoustic patterns. Because languages share phonetic space, multilingual pre-training is instrumental in learning cross-lingual information and in building systems for low-resource languages. Further fine-tuning with a limited amount of labeled data significantly boosts the model's accuracy. On the IIITH-ILSC database, the proposed features yield a relative improvement of 33.2% over the DNN-A (DNN with attention) model and 19.04% over Dense ResNets for the LID task. (Shivang Gupta and Kowshik Siva Sai Motepalli share first authorship.)
Pages: 503-512
Page count: 10