Unsupervised spoken term discovery using pseudo lexical induction

Citations: 0
Authors
Sudhakar P. [1]
Sreenivasa Rao K. [2]
Mitra P. [2]
Affiliations
[1] Advanced Technology Development Centre, Indian Institute of Technology, Kharagpur, West Bengal
[2] Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, West Bengal
Keywords
Context-free grammar; Pattern matching; Self-organising map; Speech processing; Spoken term discovery; Zero-resource
DOI
10.1007/s10772-023-10049-6
Abstract
An unsupervised spoken term discovery task aims to capture the pattern similarities among spoken terms in the absence of annotation. Such an approach is useful for untranscribed spoken content from low-resource or zero-resource languages. A challenge in the discovery task is to compute the similarities among spoken terms without annotation. Dynamic time warping (DTW) is one technique that computes a temporal alignment between two acoustic feature representations of the speech signal without annotation. However, the variabilities that arise in natural speech pose a challenge to the DTW approach and, as a result, degrade the performance of the spoken term discovery task. In this study, we overcome these challenges and improve the performance of the discovery task in three stages. First, a speaker-independent acoustic feature representation is obtained from a Self-Organising Map (SOM) to reduce the variabilities. Second, non-segmental pseudo-labels are generated for the spoken content using a context-free grammar. Finally, the spoken term similarities are obtained by grouping similar sequences using the proposed Label Sequence Similarity Mapping and Language Modelling algorithms. The performance of the proposed system was measured on the Zero-Speech challenge corpus in terms of matching, clustering and parsing quality. The experimental results reveal that the proposed approach improves the performance by 34.2% and 22.4% in English and Xitsonga, respectively, across multiple speakers. In addition, the clustering performance of the spoken terms at the word level improved by 4.2% in English. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
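As a rough illustration of the DTW baseline that the abstract mentions, the sketch below computes the classic dynamic-programming alignment cost between two sequences of feature vectors. It is not the authors' implementation: the function names `euclidean` and `dtw_distance`, the Euclidean frame distance, and the unconstrained step pattern are all assumptions made for this example.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Hypothetical helper for illustration only: uses the standard
    (insertion, deletion, match) step pattern with no path constraint.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")

    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of seq_a repeated against seq_b[j-1]
                cost[i][j - 1],      # frame of seq_b repeated against seq_a[i-1]
                cost[i - 1][j - 1],  # both sequences advance one frame
            )
    return cost[n][m]


# Toy one-dimensional "features": the second sequence is a time-stretched
# copy of the first, so the warped alignment cost is zero.
slow = [[0.0], [1.0], [1.0], [2.0]]
fast = [[0.0], [1.0], [2.0]]
print(dtw_distance(fast, slow))
```

Because the cost depends directly on frame-level feature distances, speaker and channel differences in the features inflate it even for the same word, which is the weakness the abstract attributes to DTW on natural speech.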
Pages: 801-816
Page count: 15