Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language

Cited by: 10
Authors
Verdonik, Darinka
Rojc, Matej
Stabej, Marko
Affiliations
[1] Univ Maribor, Fac Elect Engn & Comp Sci, Maribor 2000, Slovenia
[2] Univ Ljubljana, Fac Arts, Ljubljana, Slovenia
Keywords
discourse markers; speech corpora; annotating; conversation; discourse analysis; speech-to-speech translation; spontaneous speech; Slovenian language
DOI
10.1007/s10579-007-9035-7
CLC number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Speech-to-speech translation technology has difficulties processing elements of spontaneity in conversation. We propose a discourse marker attribute in speech corpora to help overcome some of these problems. There have already been some attempts to annotate discourse markers in speech corpora. However, as there is no consistency on what expressions count as discourse markers, we have to reconsider how to set a framework for annotating, and, in order to better understand what we gain by introducing a discourse marker category, we have to analyse their characteristics and functions in discourse. This is especially important for languages such as Slovenian where no or little research on the topic of discourse markers has been carried out. The aims of this paper are to present a scheme for annotating discourse markers based on the analysis of a corpus of telephone conversations in the tourism domain in the Slovenian language, and to give some additional arguments based on the characteristics and functions of discourse markers that confirm their special status in conversation.
Pages: 147-180
Page count: 34
Related papers
50 in total
  • [41] Discourse markers in relation to non-verbal behavior: How do speech and body language correlate?
    Mlakar, Izidor
    Rojc, Matej
    Majhenic, Simona
    Verdonik, Darinka
    [J]. GESTURE, 2021, 20 (01) : 103 - 134
  • [42] Hybrid language models and spontaneous legal discourse
    Kenne, PE
    OKane, M
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 717 - 720
  • [43] Spontaneous spoken language. Syntax and discourse
    Yang, XZ
    [J]. WORD-JOURNAL OF THE INTERNATIONAL LINGUISTIC ASSOCIATION, 2001, 52 (03) : 506 - 510
  • [44] Spontaneous spoken language: Syntax and discourse.
    Yule, G
    [J]. LINGUA, 2000, 110 (05) : 375 - 378
  • [45] Speech/Non-Speech Detection in Malay Language Spontaneous Speech
    Izzad, M.
    Jamil, Nursuriati
    Abu Bakar, Zainab
    [J]. 2013 INTERNATIONAL CONFERENCE ON COMPUTING, MANAGEMENT AND TELECOMMUNICATIONS (COMMANTEL), 2013, : 219 - 224
  • [46] Rapid Collection of Spontaneous Speech Corpora using Telephonic Community Forums
    Raza, Agha Ali
    Athar, Awais
    Randhawa, Shan
    Tariq, Zain
    Saleem, Muhammad Bilal
    Bin Zia, Haris
    Saif, Umar
    Rosenfeld, Roni
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 1021 - 1025
  • [47] Spontaneous spoken language: syntax and discourse.
    Mackenzie, JL
    [J]. JOURNAL OF LINGUISTICS, 2001, 37 (01) : 225 - 229
  • [48] Building multilingual speech corpora from interpreted spontaneous dialogues on the net
    Fafiotte, G
    [J]. TEXT, SPEECH AND DIALOGUE, PROCEEDINGS, 2003, 2807 : 357 - 364
  • [49] Large vocabulary speech recognition of Slovenian language using morphological models
    Maucec, M
    Rotovnik, T
    Kacic, Z
    Horvat, B
    [J]. IEEE REGION 8 EUROCON 2003, VOL B, PROCEEDINGS: COMPUTER AS A TOOL, 2003, : 158 - 161
  • [50] Discourse markers: a challenge for natural language processing
    Rey, J
    [J]. AI COMMUNICATIONS, 1997, 10 (3-4) : 177 - 184