Formant tracking using context-dependent phonemic information

Cited by: 23
Authors
Lee, M [1]
van Santen, J
Möbius, B
Olive, J
Affiliations
[1] Lucent Technol, Bell Labs, Murray Hill, NJ 07974 USA
[2] Oregon Hlth & Sci Univ, Oregon Grad Inst Sci & Technol, Sci Sci & Engn, Beaverton, OR 97006 USA
[3] Univ Stuttgart, IMS, D-70174 Stuttgart, Germany
[4] DARPA IPTO, Arlington, VA 22203 USA
Source
Keywords
automatic segmentation; coarticulation; dynamic programming; formant tracking; speech analysis;
DOI
10.1109/TSA.2005.851904
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
A new formant-tracking algorithm that uses phoneme information is proposed. Conventional formant-tracking algorithms obtain formant tracks by analyzing the acoustic speech signal under continuity constraints, without any additional information. The formant-tracking error rate of these conventional methods is reportedly in the range of 10%-20%. In this paper, we show that if a text or phoneme transcription of the speech utterance is available, the error rate can be significantly reduced. The basic idea is that, given the phoneme identity, a formant-tracking algorithm has a better indication of where to look for formants. The algorithm consists of three phases: 1) analysis, 2) segmentation and alignment, and 3) formant tracking by the Viterbi search algorithm. In the analysis phase, formant candidates are obtained for each analysis frame by solving the linear prediction polynomial. In the segmentation and alignment phase, the text corresponding to the input speech utterance is converted into a sequence of phoneme symbols, and the phoneme sequence is then time-aligned with the speech utterance. A hidden Markov model (HMM) based automatic segmentation algorithm is used for the forced time alignment. Nominal formant frequencies are assigned at the center of each phoneme segment, and nominal formant tracks for the entire utterance are then obtained by interpolating between them. To compensate for coarticulation, different interpolation methods are used depending on the phonemic context. The interpolation process makes the formant-tracking algorithm robust to segmentation errors made by the HMM-based segmentation algorithm. As a result, the proposed formant-tracking algorithm does not require highly accurate alignment or segmentation.
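Two of the steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `formant_candidates` and `nominal_tracks` are hypothetical, the per-phoneme target values in the usage lines are made up for illustration, and plain linear interpolation stands in for the context-dependent interpolation methods, which the abstract does not specify.

```python
import numpy as np


def formant_candidates(lpc_coeffs, sample_rate):
    """Candidate formants from the roots of a linear prediction polynomial.

    lpc_coeffs is [1, a1, ..., ap], the coefficients of A(z) in powers of z^-1.
    Returns (frequencies_hz, bandwidths_hz), sorted by frequency.
    """
    roots = np.roots(lpc_coeffs)
    # Keep one root from each complex-conjugate pair; real roots are dropped.
    roots = roots[np.imag(roots) > 0]
    freqs = np.angle(roots) * sample_rate / (2 * np.pi)
    bands = -np.log(np.abs(roots)) * sample_rate / np.pi
    order = np.argsort(freqs)
    return freqs[order], bands[order]


def nominal_tracks(segment_bounds, targets, n_frames):
    """Interpolate per-phoneme nominal formants placed at segment centers.

    segment_bounds: list of (start_frame, end_frame), one per phoneme.
    targets: array of shape (n_phonemes, n_formants), in Hz.
    Assumption: linear interpolation, held flat before the first center
    and after the last one.
    """
    centers = np.array([(start + end) / 2.0 for start, end in segment_bounds])
    targets = np.asarray(targets, dtype=float)
    frames = np.arange(n_frames)
    return np.column_stack(
        [np.interp(frames, centers, targets[:, k]) for k in range(targets.shape[1])]
    )


# Illustrative usage: two phoneme segments with made-up F1/F2 targets.
tracks = nominal_tracks([(0, 10), (10, 20)], [[300, 2300], [700, 1200]], 21)
```

Because the nominal tracks change gradually between segment centers, a boundary that is off by a few frames shifts the nominal values only slightly, which is consistent with the robustness to segmentation errors claimed in the abstract.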
Finally, a set of formants is chosen from the formant candidates so that the resulting formant tracks stay close to the nominal formant tracks while satisfying the continuity constraints. The algorithm is tested on natural speech utterances, and its performance is compared against formant tracks obtained by the conventional method, which uses continuity constraints only. The new algorithm reduces the formant-tracking error rate to 5.03% for male and 3.73% for female speakers, compared with 13.00% and 15.82%, respectively, for the conventional formant-tracking algorithm.
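The selection step can be sketched as a Viterbi search over per-frame candidate sets. This is a simplified illustration under stated assumptions, not the paper's cost function: the name `track_formants`, the absolute-difference costs, and the `continuity_weight` parameter are all hypothetical, and every frame is assumed to supply at least as many candidates as there are formants to track.

```python
from itertools import combinations

import numpy as np


def track_formants(candidates, nominal, continuity_weight=1.0):
    """Pick one ascending set of formants per frame by dynamic programming.

    candidates: list with one sequence of candidate frequencies (Hz) per frame.
    nominal: array of shape (n_frames, n_formants) of nominal tracks (Hz).
    Returns an array of shape (n_frames, n_formants) with the chosen formants.
    """
    nominal = np.asarray(nominal, dtype=float)
    n_frames, n_formants = nominal.shape
    # A state is an ascending choice of n_formants candidates in one frame.
    states = [
        np.array(list(combinations(sorted(frame), n_formants)), dtype=float)
        for frame in candidates
    ]
    # Local cost: total distance of a state from the nominal formants.
    cost = np.abs(states[0] - nominal[0]).sum(axis=1)
    backpointers = []
    for t in range(1, n_frames):
        local = np.abs(states[t] - nominal[t]).sum(axis=1)
        # jump[i, j]: frame-to-frame change from state i (t-1) to state j (t).
        jump = np.abs(states[t - 1][:, None, :] - states[t][None, :, :]).sum(axis=2)
        total = cost[:, None] + continuity_weight * jump
        backpointers.append(total.argmin(axis=0))
        cost = total.min(axis=0) + local
    # Trace the cheapest path back from the last frame.
    index = int(cost.argmin())
    path = [index]
    for t in range(n_frames - 1, 0, -1):
        index = int(backpointers[t - 1][index])
        path.append(index)
    path.reverse()
    return np.array([states[t][i] for t, i in enumerate(path)])


# Illustrative usage with made-up candidates and flat nominal tracks.
chosen = track_formants(
    [[300, 1000, 2300], [310, 1500, 2250], [320, 900, 2200]],
    [[300, 2300]] * 3,
)
```

Setting `continuity_weight` to a large value with a constant nominal term approximates tracking by continuity constraints only, the conventional baseline the abstract compares against.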
Pages: 741-750
Number of pages: 10