Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language

Cited by: 10
Authors
Verdonik, Darinka
Rojc, Matej
Stabej, Marko
Affiliations
[1] Univ Maribor, Fac Elect Engn & Comp Sci, Maribor 2000, Slovenia
[2] Univ Ljubljana, Fac Arts, Ljubljana, Slovenia
Keywords
discourse markers; speech corpora; annotating; conversation; discourse analysis; speech-to-speech translation; spontaneous speech; Slovenian language;
DOI
10.1007/s10579-007-9035-7
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Speech-to-speech translation technology has difficulties processing elements of spontaneity in conversation. We propose a discourse marker attribute in speech corpora to help overcome some of these problems. There have already been some attempts to annotate discourse markers in speech corpora. However, as there is no consistency on what expressions count as discourse markers, we have to reconsider how to set a framework for annotating, and, in order to better understand what we gain by introducing a discourse marker category, we have to analyse their characteristics and functions in discourse. This is especially important for languages such as Slovenian where no or little research on the topic of discourse markers has been carried out. The aims of this paper are to present a scheme for annotating discourse markers based on the analysis of a corpus of telephone conversations in the tourism domain in the Slovenian language, and to give some additional arguments based on the characteristics and functions of discourse markers that confirm their special status in conversation.
Pages: 147-180
Page count: 34
Related papers
50 in total
  • [21] The discourse of speech-language pathology
    Ferguson, Alison
    [J]. INTERNATIONAL JOURNAL OF SPEECH-LANGUAGE PATHOLOGY, 2009, 11 (02) : 104 - 112
  • [22] Spontaneous spoken language - Syntax and discourse
    Unknown
    [J]. FOLIA LINGUISTICA, 1998, 32 (1-2) : 155 - 155
  • [23] Spontaneous spoken language: Syntax and discourse
    Brody, Mary Jill
    [J]. LANGUAGE IN SOCIETY, 2013, 42 (03) : 341 - 342
  • [24] Towards Prosodic Phrasing of Spontaneous and Reading Speech for Romanian Corpora
    Apopei, Vasile
    Paduraru, Otilia
    [J]. 2015 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2015,
  • [25] THE MARKERS OF THE DISCOURSE OF THE YOUTH LANGUAGE OF MADRID
    Myre Jorgensen, Annette
    Martinez Lopez, Juan A.
    [J]. REVISTA VIRTUAL DE ESTUDOS DA LINGUAGEM-REVEL, 2007, 5 (09):
  • [26] Discourse Markers in Second Language French
    Dargnat, Mathilde
    [J]. JOURNAL OF FRENCH LANGUAGE STUDIES, 2024,
  • [27] Boundary Markers in Spontaneous Hungarian Speech
    Beke, Andras
    Gosy, Maria
    Horvath, Viktoria
    [J]. HUMAN LANGUAGE TECHNOLOGY: CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, 2016, 9561 : 3 - 15
  • [28] AlpSynth - Concatenation-based speech synthesis for the Slovenian language
    Gros, JZ
    Mihelic, A
    Pavesic, N
    Zganec, M
    Gruden, S
    [J]. Proceedings ELMAR-2005, 2005, : 213 - 216
  • [29] Construction of Chinese Conversational Corpora for Spontaneous Speech Recognition and Comparative Study on the Trilingual Parallel Corpora
    Hu, Xinhui
    Isotani, Ryosuke
    Nakamura, Satoshi
    [J]. ORIENTAL COCOSDA 2009 - INTERNATIONAL CONFERENCE ON SPEECH DATABASE AND ASSESSMENTS, 2009, : 56 - 59
  • [30] Spontaneous language markers of Spanish language impairment
    Simon-Cereijido, Gabriela
    Gutierrez-Clellen, Vera F.
    [J]. APPLIED PSYCHOLINGUISTICS, 2007, 28 (02) : 317 - 339