Utterance Verification Using State-Level Log-Likelihood Ratio with Frame and State Selection

Authors
Kwon, Suk-Bong [1]
Kim, Hoirin [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Taejon 305701, South Korea
Keywords
utterance verification; confidence measure; likelihood ratio testing; state-level log-likelihood ratio; frame selection; state selection; confidence measures; speech recognition
DOI
10.1587/transinf.E93.D.647
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper proposes an utterance verification system that uses state-level log-likelihood ratios with frame and state selection. We use hidden Markov models as the acoustic models and anti-phone models for speech recognition and utterance verification. Each hidden Markov model has three states, and each state represents different characteristics of a phone. We therefore propose an algorithm that computes a state-level log-likelihood ratio and assigns weights to the states in order to obtain a more reliable confidence measure for recognized phones. Additionally, we propose a frame selection algorithm that computes the confidence measure only on those frames of the input that contain proper speech. In general, the phone segmentation obtained from a speaker-independent speech recognition system is not accurate, because triphone-based acoustic models are difficult to train effectively enough to cover diverse pronunciations and coarticulation effects. Finding the correctly matched states is therefore even more difficult when obtaining state segmentation information, so a state selection algorithm is proposed for finding valid states. The proposed method, using state-level log-likelihood ratios with frame and state selection, achieves an 18.1% relative reduction in equal error rate compared with the baseline system that uses simple phone-level log-likelihood ratios.
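The abstract does not give the scoring formulas, so the following is only a minimal sketch of how a weighted state-level log-likelihood ratio could be computed for one recognized phone. Everything in it is an assumption made for illustration: the function name `state_level_llr`, the per-state averaging of frame scores, the normalization by the weights of the states that remain, the externally supplied `frame_mask` standing in for frame selection, and the treatment of state selection as simply skipping states with no retained frames. It should not be read as the authors' algorithm.

```python
from typing import Optional, Sequence


def state_level_llr(
    target_ll: Sequence[float],      # per-frame log-likelihood under the phone model
    anti_ll: Sequence[float],        # per-frame log-likelihood under the anti-phone model
    state_ids: Sequence[int],        # HMM state (0, 1, 2) aligned to each frame
    state_weights: Sequence[float],  # hypothetical weight for each state
    frame_mask: Optional[Sequence[bool]] = None,  # True = frame kept (assumed given)
) -> Optional[float]:
    """Weighted average of per-state mean log-likelihood ratios (illustrative only)."""
    if frame_mask is None:
        frame_mask = [True] * len(target_ll)

    sums: dict = {}
    counts: dict = {}
    for t_ll, a_ll, state, keep in zip(target_ll, anti_ll, state_ids, frame_mask):
        if not keep:
            continue  # frame selection: ignore frames judged not to be proper speech
        sums[state] = sums.get(state, 0.0) + (t_ll - a_ll)
        counts[state] = counts.get(state, 0) + 1

    if not sums:
        return None  # no usable frames, so no confidence score

    # State selection (assumed form): only states with retained frames contribute,
    # and the weights of those states are renormalized.
    total_weight = sum(state_weights[s] for s in sums)
    weighted = sum(state_weights[s] * sums[s] / counts[s] for s in sums)
    return weighted / total_weight
```

In this sketch a phone would be accepted when the returned score exceeds a threshold chosen on development data; the threshold and the weights are free parameters that the abstract does not specify.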
Pages: 647-650
Number of pages: 4