Significance of neural phonotactic models for large-scale spoken language identification

Cited by: 0
Authors
Srivastava, Brij Mohan Lal [1 ]
Vydana, Hari [1 ]
Vuppala, Anil Kumar [1 ]
Shrivastava, Manish [1 ]
Affiliations
[1] Int Inst Informat Technol, Language Technol Res Ctr, Hyderabad, Telangana, India
Keywords
RECOGNITION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Language identification (LID) is a vital front-end for spoken dialogue systems operating in diverse linguistic settings, reducing recognition and understanding errors. Existing LID systems that classify using low-level signal information do not scale well, because their parameter count grows exponentially as the number of classes increases. They also suffer performance degradation from the inherent variabilities of the speech signal. In the proposed approach, we model the language-specific phonotactic information in speech using recurrent neural networks to develop an LID system. The input speech signal is tokenized into phone sequences by a common language-independent phone recognizer with varying phonetic coverage, and we establish a causal relationship between phonetic coverage and LID performance. The phonotactics of the observed phone sequences are modeled using statistical and recurrent neural network language models that predict language-specific symbols from a universal phonetic inventory. The proposed approach is robust, computationally lightweight, and highly scalable. Experiments show that a convex combination of the statistical and recurrent neural network language model (RNNLM) phonotactic models significantly outperforms a strong deep neural network (DNN) baseline, which itself surpasses an i-vector-based approach to LID. The proposed approach outperforms the baseline models in terms of mean F1 score over 176 languages. Further, we provide information-theoretic evidence to analyze the mechanism of the proposed approach.
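To make the decision rule described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation: each candidate language's phonotactics are scored by an n-gram language model and an RNNLM over a shared universal phone inventory, the two probabilities are combined as a convex combination with weight lam, and the highest-scoring language is returned. All names here (PhoneLM, interpolated_log_likelihood, identify_language, prob) are illustrative assumptions, and interpolating per phone rather than per utterance is one of several reasonable implementation choices.

    # Hypothetical sketch of convex-combination phonotactic LID scoring.
    import math
    from typing import Dict, List, Protocol, Tuple

    class PhoneLM(Protocol):
        def prob(self, phone: str, history: List[str]) -> float:
            """P(phone | history) under one language's phonotactic model."""
            ...

    def interpolated_log_likelihood(
        phones: List[str], ngram: PhoneLM, rnnlm: PhoneLM, lam: float = 0.5
    ) -> float:
        """Log-likelihood of a phone sequence under the convex combination
        lam * P_ngram + (1 - lam) * P_rnnlm, applied per phone."""
        logp = 0.0
        for i, ph in enumerate(phones):
            hist = phones[:i]
            p = lam * ngram.prob(ph, hist) + (1.0 - lam) * rnnlm.prob(ph, hist)
            logp += math.log(max(p, 1e-12))  # floor avoids log(0) on unseen phones
        return logp

    def identify_language(
        phones: List[str],
        models: Dict[str, Tuple[PhoneLM, PhoneLM]],  # lang -> (n-gram LM, RNNLM)
        lam: float = 0.5,
    ) -> str:
        """Tokenized utterance in, best-scoring language label out."""
        return max(
            models,
            key=lambda lang: interpolated_log_likelihood(
                phones, models[lang][0], models[lang][1], lam
            ),
        )

In practice the interpolation weight lam would be tuned on held-out data; because all per-language models share one universal phone tokenizer, adding a language adds only its two language models, which is consistent with the scalability claim in the abstract.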
Pages
2144-2151 (8 pages)