Using prosody for automatic sentence segmentation of multi-party meetings

Cited by: 0

Authors:

Kolar, Jachym [1]

Shriberg, Elizabeth

Liu, Yang

Affiliations:

[1] Int Comp Sci Inst, Berkeley, CA 94704 USA

[2] Univ W Bohemia, Dept Cybernet, Plzen, Czech Republic

[3] SRI Int, Menlo Pk, CA 94025 USA

[4] Univ Texas Dallas, Dallas, TX 75230 USA

Source:

TEXT, SPEECH AND DIALOGUE, PROCEEDINGS | 2006 / Vol. 4188

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We explore the use of prosodic features beyond pauses, including duration, pitch, and energy features, for automatic sentence segmentation of ICSI meeting data. We examine two different approaches to boundary classification: score-level combination of independent language and prosodic models using HMMs, and feature-level combination of models using a boosting-based method (BoosTexter). We report classification results for reference word transcripts as well as for transcripts from a state-of-the-art automatic speech recognizer (ASR). We also compare results using the lexical model plus a pause-only prosody model, versus results using additional prosodic features. Results show that (1) information from pauses is important, including pause duration both at the boundary and at the previous and following word boundaries; (2) adding duration, pitch, and energy features yields significant improvement over pause alone; (3) the integrated boosting-based model performs better than the HMM for ASR conditions; (4) training the boosting-based model on recognized words yields further improvement.

Pages: 629-636

Number of pages: 8
