Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language

Cited by: 10
Authors
Verdonik, Darinka
Rojc, Matej
Stabej, Marko
Affiliations
[1] Univ Maribor, Fac Elect Engn & Comp Sci, Maribor 2000, Slovenia
[2] Univ Ljubljana, Fac Arts, Ljubljana, Slovenia
Keywords
discourse markers; speech corpora; annotating; conversation; discourse analysis; speech-to-speech translation; spontaneous speech; Slovenian language
DOI
10.1007/s10579-007-9035-7
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Speech-to-speech translation technology has difficulty processing elements of spontaneity in conversation. We propose a discourse marker attribute in speech corpora to help overcome some of these problems. There have already been some attempts to annotate discourse markers in speech corpora. However, as there is no consensus on which expressions count as discourse markers, we have to reconsider how to set up a framework for annotation; and, to better understand what is gained by introducing a discourse marker category, we have to analyse the characteristics and functions of discourse markers in discourse. This is especially important for languages such as Slovenian, where little or no research on discourse markers has been carried out. The aims of this paper are to present a scheme for annotating discourse markers, based on the analysis of a corpus of Slovenian telephone conversations in the tourism domain, and to give additional arguments, based on the characteristics and functions of discourse markers, that confirm their special status in conversation.
Pages: 147-180
Number of pages: 34