BAG OF ARCS: NEW REPRESENTATION OF SPEECH SEGMENT FEATURES BASED ON FINITE STATE MACHINES

Cited by: 0
Authors
Watanabe, Shinji [1]
Kubo, Yotaro [1]
Oba, Takanobu [1]
Hori, Takaaki [1]
Nakamura, Atsushi [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, Kyoto, Japan
Keywords
Speech segment feature; finite state machine; Bag Of Arcs (BOA); speaker recognition; utterance classification; MODEL;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
This paper proposes a new feature representation for speech segments, Bag Of Arcs (BOA). In BOA, a speech segment is represented simply as a set of counts of the unique arcs in a finite state machine. Like the Bag Of Words (BOW) model, BOA disregards the order of arcs and thus models speech segments efficiently. A strong motivation for BOA is that the representation is tightly connected to the output of a Weighted Finite State Transducer (WFST) based ASR decoder. BOA therefore directly represents elements in the search network of a WFST-based ASR decoder and can include information about context-dependent HMM topologies, lexicons, and back-off smoothed n-gram networks. In addition, the BOA counts are accumulated directly from the WFST decoder output, so feature extraction requires neither additional overhead nor changes to the decoding algorithm. Consequently, the ASR decoder and post-processing can be combined without a step that extracts word features from the decoder outputs and without re-compiling WFST networks. We show the effectiveness of the proposed approach for ASR post-processing applications in utterance classification experiments and in speaker adaptation experiments, achieving an absolute 1% improvement in WER over the baseline results. We also show examples of latent semantic analysis for BOA using latent Dirichlet allocation.
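As a rough illustration of the counting idea in the abstract, the sketch below turns a sequence of arc identifiers into an order-independent count vector. It is not the authors' implementation: the function name `bag_of_arcs`, the integer arc IDs, and the example path are all hypothetical, and a real WFST decoder would supply the arc sequence from its output.

```python
from collections import Counter
from typing import Iterable, List


def bag_of_arcs(arc_path: Iterable[int], num_arcs: int) -> List[int]:
    """Count how often each arc ID occurs on a path, ignoring arc order.

    Assumption: arcs of the search network are numbered 0 .. num_arcs - 1.
    IDs outside that range are ignored in this sketch.
    """
    counts = Counter(arc_path)
    return [counts.get(arc_id, 0) for arc_id in range(num_arcs)]


# Hypothetical arc IDs traversed for one speech segment.
path = [3, 1, 3, 0, 3]
vec = bag_of_arcs(path, num_arcs=5)

# Reordering the arcs yields the same vector, as in a bag-of-words model.
same = bag_of_arcs(reversed(path), num_arcs=5)
```

A fixed-length count vector of this kind could then be passed to a standard classifier or topic model, which is the role the abstract assigns to BOA features.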
Pages: 4201-4204
Page count: 4