The Corpus of American Danish: a language resource of spoken immigrant Danish in North and South America

Cited by: 2

Authors:

Kühl, Karoline [1]

Petersen, Jan Heegård [1]

Hansen, Gert Foget [1]

Affiliations:

[1] Univ Copenhagen, Dept Nord Studies & Linguist, Emil Holms Kanal 2, DK-2300 Copenhagen, Denmark

Source:

Language Resources and Evaluation | 2020 / Vol. 54 / Issue 03

Keywords:

Corpus documentation; Spoken language resource; Validation procedures; Heritage language; Danish; Multilingual spoken language; Language contact

DOI:

10.1007/s10579-019-09473-5

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Subject classification codes:

081203; 0835

Abstract:

This paper describes the 'Corpus of American Danish' (CoAmDa), a newly established corpus of spoken immigrant Danish in North and South America. The CoAmDa amounts to approx. 1.7 million tokens, making it one of the largest corpora of heritage language at present. With regard to text type, the CoAmDa is a non-standard multilingual spoken language resource, as Danish is mixed with American English, Canadian English or Argentine Spanish, respectively, in every recording. The aim of this note is to document relevant aspects and specifications of the CoAmDa, viz. the audio data, the sociodemographic metadata of the speakers, the digitization process of analog data, the transcription procedures, the format and tagging of the speech files and the internal validation procedures. In so doing, we wish to share our experience and best practices for achieving a high-quality spoken language resource with the interested public, in particular other researchers working on and with multilingual speech corpora.

Pages: 831-849

Number of pages: 19
