Speech Rate Calculations with Short Utterances: A Study from a Speech-to-Speech, Machine Translation Mediated Map Task

Cited by: 0

Authors:

Akira, Hayakawa [1]

Vogel, Carl [1]

Luz, Saturnino [2]

Campbell, Nick [1]

Affiliations:

[1] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin, Ireland

[2] Univ Edinburgh, Usher Inst Populat Hlth Sci & Informat, Edinburgh, Midlothian, Scotland

Source:

PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018) | 2018

Funding:

Science Foundation Ireland;

Keywords:

speech rate; utterance duration comparison; task oriented dialogues;

DOI:

Not available

Chinese Library Classification:

TP39 [Computer applications];

Discipline classification codes:

081203; 0835;

Abstract:

This paper presents a way to verify whether an utterance within a corpus is pronounced at a fast or slow pace: an alternative to the well-known Words-Per-Minute (wpm) method for cases where that approach is not applicable. For long segments, such as the full introduction section of a speech or presentation, measuring wpm is a viable option. For short comparisons of the same single word or multiple syllables, Syllables-Per-Second (sps) is also a viable option. However, when there are many short utterances, as is frequent in task-oriented dialogues or natural free-flowing conversation, such as the direct human-to-human dialogues of the HCRC Map Task corpus or the computer-mediated inter-lingual dialogues of the ILMT-s2s corpus, it becomes difficult to obtain a meaningful value for the utterance speech rate. In this paper we explain the method used to provide an alternative speech rate value for the utterances of the ILMT-s2s corpus and the HCRC Map Task corpus.

Pages: 3176-3183

Page count: 8
