Multi-View Semi-Supervised Learning for Dialog Act Segmentation of Speech

Cited by: 24
Authors
Guz, Umit [1 ,2 ]
Cuendet, Sebastien [1 ,3 ]
Hakkani-Tuer, Dilek [1 ]
Tur, Gokhan [4 ]
Affiliations
[1] Int Comp Sci Inst, Speech Grp, Berkeley, CA 94704 USA
[2] Isik Univ, Dept Elect Engn, Fac Engn, TR-34980 Istanbul, Turkey
[3] Optaros, CH-8037 Zurich, Switzerland
[4] SRI Int, Speech Technol & Res STAR Lab, Menlo Pk, CA 94025 USA
Funding
Swiss National Science Foundation;
Keywords
Boosting; co-training; prosody; self-training; semi-supervised learning; sentence segmentation;
DOI
10.1109/TASL.2009.2028371
CLC number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Sentence segmentation of speech aims at determining sentence boundaries in a stream of words as output by the speech recognizer. Typically, statistical methods are used for sentence segmentation. However, they require significant amounts of labeled data, preparation of which is time-consuming, labor-intensive, and expensive. This work investigates the application of multi-view semi-supervised learning algorithms to the sentence boundary classification problem by using lexical and prosodic information. The aim is to find an effective semi-supervised machine learning strategy when only small sets of sentence boundary-labeled data are available. We especially focus on two semi-supervised learning approaches, namely, self-training and co-training. We also compare different example selection strategies for co-training, namely, agreement and disagreement. Furthermore, we propose another method, called self-combined, which is a combination of self-training and co-training. The experimental results obtained on the ICSI Meeting (MRDA) Corpus show that both multi-view methods outperform self-training, and the best results are obtained using co-training alone. This study shows that sentence segmentation is very appropriate for multi-view learning since the data sets can be represented by two disjoint and redundantly sufficient feature sets, namely, lexical and prosodic information. Performance of the lexical and prosodic models is improved by 26% and 11% relative, respectively, when only a small set of manually labeled examples is used. When both information sources are combined, the semi-supervised learning methods improve the baseline F-measure from 69.8% to 74.2%.
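For readers unfamiliar with co-training, the sketch below illustrates the general idea on synthetic data: one classifier per view (lexical and prosodic), where unlabeled examples on which both views confidently agree are added, with their predicted labels, to the shared training set. This is a minimal illustration under stated assumptions, not the paper's actual setup: the authors use Boosting-based classifiers on the MRDA corpus, whereas the sketch uses scikit-learn logistic regression on made-up features, and the confidence threshold is an arbitrary choice.

```python
# Minimal co-training sketch (illustrative only, not the paper's method).
# Two views of the same examples; "agreement"-style selection is
# approximated by keeping examples where both views are confident
# and predict the same class.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_view(n, dim, y):
    # Each view carries an independent noisy copy of the class signal.
    return y[:, None] * 1.0 + rng.normal(scale=1.0, size=(n, dim))

n_labeled, n_unlabeled, dim = 100, 1000, 5
y_lab = rng.integers(0, 2, n_labeled)        # boundary / no-boundary labels
y_unl = rng.integers(0, 2, n_unlabeled)      # hidden labels of unlabeled pool

X_lex_lab, X_pro_lab = make_view(n_labeled, dim, y_lab), make_view(n_labeled, dim, y_lab)
X_lex_unl, X_pro_unl = make_view(n_unlabeled, dim, y_unl), make_view(n_unlabeled, dim, y_unl)

lex_X, pro_X, lab = X_lex_lab.copy(), X_pro_lab.copy(), y_lab.copy()

for it in range(5):                          # a few co-training rounds
    clf_lex = LogisticRegression(max_iter=1000).fit(lex_X, lab)
    clf_pro = LogisticRegression(max_iter=1000).fit(pro_X, lab)

    p_lex = clf_lex.predict_proba(X_lex_unl)[:, 1]
    p_pro = clf_pro.predict_proba(X_pro_unl)[:, 1]

    # Agreement selection: both views confident and predicting the same class.
    pred_lex, pred_pro = (p_lex > 0.5).astype(int), (p_pro > 0.5).astype(int)
    confident = (np.abs(p_lex - 0.5) > 0.3) & (np.abs(p_pro - 0.5) > 0.3)
    agree = confident & (pred_lex == pred_pro)
    if not agree.any():
        break

    # Add the automatically labeled examples to both views' training sets.
    lex_X = np.vstack([lex_X, X_lex_unl[agree]])
    pro_X = np.vstack([pro_X, X_pro_unl[agree]])
    lab = np.concatenate([lab, pred_lex[agree]])

    # Remove them from the unlabeled pool.
    keep = ~agree
    X_lex_unl, X_pro_unl, y_unl = X_lex_unl[keep], X_pro_unl[keep], y_unl[keep]

print(f"labeled set grew to {len(lab)} examples after co-training")
```

Roughly, the disagreement strategy described in the abstract instead focuses on unlabeled examples where the two views' predictions differ, letting the more confident view supply the label for the other; self-training uses a single model (or combined feature set) to label its own most confident examples.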
Pages: 320-329
Page count: 10