An optimized iterative clustering framework for recognizing speech

Cited by: 10
Authors
Palanivinayagam, Ashokkumar [1]
Nagarajan, Sureshkumar [1]
Affiliations
[1] VIT Univ, Sch Engn & Comp Sci, Vellore 632014, Tamil Nadu, India
Keywords
Speech document clustering; Iterative speech error correction; Similarity of documents; Probability clustering; Speech mining;
DOI
10.1007/s10772-020-09728-5
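Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code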
0808 ; 0809 ;
Abstract
In recent years, many methods have been proposed to recognize spoken language and convert it to text. In this paper, we propose a novel iterative clustering algorithm that operates on the converted text and reduces the errors in it. The proposed methodology involves three steps executed over many iterations: (1) unknown word probability assignment, (2) multi-probability normalization, and (3) probability filtering. In the first step, each iteration learns the unknown words from previous iterations and assigns a new probability to each unknown word based on the temporary results obtained in the previous iteration. This process continues until no unknown words are left. The second step normalizes the multiple probabilities assigned to a single word by considering the probabilities of neighbouring words. The last step eliminates probabilities below a threshold, which reduces noise. We measure the quality of clustering on many real-world benchmark datasets. Results show that our optimized algorithm produces more accurate clustering than other clustering algorithms.
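The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration and not the authors' method: the function name `iterative_word_probabilities`, the use of seed words with known cluster probabilities, taking candidate probabilities from the immediately adjacent words, averaging as the normalization rule, and the default threshold value.

```python
from collections import defaultdict


def iterative_word_probabilities(docs, seed, threshold=0.2, max_iters=10):
    """Hypothetical sketch of the three-step loop described in the abstract.

    docs: list of token lists (recognized text, one list per document).
    seed: {word: {cluster: probability}} for words that are already known.
    Returns {word: {cluster: probability}} for every word that could be resolved.
    """
    probs = {word: dict(dist) for word, dist in seed.items()}
    vocab = {word for doc in docs for word in doc}

    for _ in range(max_iters):
        unknown = vocab - probs.keys()
        if not unknown:
            break  # the process stops once no unknown words are left

        # Step 1 (assumed rule): each unknown word collects the distributions
        # of its adjacent words that were known after the previous iteration.
        candidates = defaultdict(list)
        for doc in docs:
            for i, word in enumerate(doc):
                if word not in unknown:
                    continue
                for j in (i - 1, i + 1):
                    if 0 <= j < len(doc) and probs.get(doc[j]):
                        candidates[word].append(probs[doc[j]])
        if not candidates:
            break  # no unknown word has a known neighbour; nothing more to learn

        for word, dists in candidates.items():
            # Step 2 (assumed rule): merge the multiple candidate
            # probabilities for one word by averaging them per cluster.
            merged = defaultdict(float)
            for dist in dists:
                for cluster, p in dist.items():
                    merged[cluster] += p / len(dists)

            # Step 3: drop probabilities below the threshold, then rescale
            # the survivors so they sum to 1.
            kept = {c: p for c, p in merged.items() if p >= threshold}
            total = sum(kept.values())
            probs[word] = {c: p / total for c, p in kept.items()} if total else {}

    return probs


if __name__ == "__main__":
    docs = [["sports", "goal", "match"], ["stock", "market", "share"]]
    seed = {"sports": {"A": 1.0}, "stock": {"B": 1.0}}
    for word, dist in sorted(iterative_word_probabilities(docs, seed).items()):
        print(word, dist)
```

In this toy run the seed labels spread outward one neighbour per iteration, so "goal" is resolved in the first pass and "match" in the second. Collecting all candidates before updating `probs` keeps each iteration dependent only on the previous iteration's results, which mirrors the wording of the abstract.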
Pages: 767-777
Number of pages: 11
Related papers
50 in total
  • [1] An optimized iterative clustering framework for recognizing speech
    Ashokkumar Palanivinayagam
    Sureshkumar Nagarajan
    International Journal of Speech Technology, 2020, 23 : 767 - 777
  • [2] iterClust: a statistical framework for iterative clustering analysis
    Ding, Hongxu
    Wang, Wanxin
    Califano, Andrea
    BIOINFORMATICS, 2018, 34 (16) : 2865 - 2866
  • [3] The Iterative Clustering framework for the CMS HGCAL Reconstruction
    Pantaleo, Felice
    Rovere, Marco
    20TH INTERNATIONAL WORKSHOP ON ADVANCED COMPUTING AND ANALYSIS TECHNIQUES IN PHYSICS RESEARCH, 2023, 2438
  • [4] Structured GMM Based on Unsupervised Clustering for Recognizing Adult and Child Speech
    Gorin, Arseniy
    Jouvet, Denis
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2014, 2014, 8791 : 108 - 119
  • [5] Active learning framework with iterative clustering for bioimage classification
    Kutsuna, Natsumaro
    Higaki, Takumi
    Matsunaga, Sachihiro
    Otsuki, Tomoshi
    Yamaguchi, Masayuki
    Fujii, Hirofumi
    Hasezawa, Seiichiro
    NATURE COMMUNICATIONS, 2012, 3
  • [6] Active learning framework with iterative clustering for bioimage classification
    Natsumaro Kutsuna
    Takumi Higaki
    Sachihiro Matsunaga
    Tomoshi Otsuki
    Masayuki Yamaguchi
    Hirofumi Fujii
    Seiichiro Hasezawa
    Nature Communications, 3
  • [7] CONIC: Contour Optimized Non-Iterative Clustering Superpixel Segmentation
    Li, Cheng
    Guo, Baolong
    Liao, Nannan
    Gong, Jianglei
    Han, Xiaodong
    Hou, Shuwei
    Chen, Zhijie
    He, Wangpeng
    REMOTE SENSING, 2021, 13 (06)
  • [8] An iterative MapReduce framework for sports-based tweet clustering
    Saxena, Gaurangi
    Santurkar, Siddharth
    6TH INTERNATIONAL CONFERENCE ON COMPUTER & COMMUNICATION TECHNOLOGY (ICCCT-2015), 2015 : 9 - 14
  • [9] Optimized clustering-based discovery framework on Internet of Things
    Bharti, Monika
    Jindal, Himanshu
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (02) : 1739 - 1778
  • [10] Optimized clustering-based discovery framework on Internet of Things
    Monika Bharti
    Himanshu Jindal
    The Journal of Supercomputing, 2021, 77 : 1739 - 1778