Word Segmentation Cues in German Child-Directed Speech: A Corpus Analysis

Cited by: 13
Authors
Stark, Katja [1]
Kidd, Evan [1, 2, 3]
Frost, Rebecca L. A. [1]
Affiliations
[1] Max Planck Inst Psycholinguist, Language Dev Dept, Wundtlaan 1, NL-6525 XD Nijmegen, Netherlands
[2] Australian Natl Univ, Res Sch Psychol, Canberra, ACT, Australia
[3] ARC Ctr Excellence Dynam Language, Canberra, ACT, Australia
Funding
Australian Research Council
Keywords
Language acquisition; speech segmentation; distributional cues; child-directed speech; German; INFANTS DISCRIMINATION; LANGUAGE; FREQUENCY; STRESS; PERCEPTION; CONSTRAINTS; STATISTICS; MORPHEMES; PATTERNS; FRENCH;
DOI
10.1177/0023830920979016
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
To acquire language, infants must learn to segment words from running speech. A significant body of experimental research shows that infants use multiple cues to do so; however, little research has comprehensively examined the distribution of such cues in naturalistic speech. We conducted a comprehensive corpus analysis of German child-directed speech (CDS) using data from the Child Language Data Exchange System (CHILDES) database, investigating the availability of word stress, transitional probabilities (TPs), and lexical and sublexical frequencies as potential cues for word segmentation. Seven hours of data (approximately 15,000 words) were coded, representing around an average day of speech to infants. The analysis revealed that for 97% of words, primary stress was carried by the initial syllable, implicating stress as a reliable cue to word onset in German CDS. Word identity was also marked by TPs between syllables, which were higher within than between words, and higher for backwards than forwards transitions. Words followed a Zipfian-like frequency distribution, and over two-thirds of words (78%) were monosyllabic. Of the 50 most frequent words, 82% were function words, which accounted for 47% of word tokens in the entire corpus. Finally, 15% of all utterances comprised single words. These results give rich novel insights into the availability of segmentation cues in German CDS, and support the possibility that infants draw on multiple converging cues to segment their input. The data, which we make openly available to the research community, will help guide future experimental investigations on this topic.
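The abstract compares forward and backward transitional probabilities within and between words. As a minimal sketch of how such a comparison could be computed, and not the authors' actual pipeline, the Python below uses the common definitions forward TP(x→y) = freq(xy) / freq(x) and backward TP(x→y) = freq(xy) / freq(y), counted over adjacent syllables within an utterance. The function name `syllable_tps`, the input layout (utterances as lists of words, words as lists of syllables), and the toy syllabifications are all assumptions made for illustration.

```python
from collections import Counter


def syllable_tps(utterances):
    """Mean forward/backward TPs for within-word and between-word syllable pairs.

    utterances: list of utterances; each utterance is a list of words;
    each word is a list of syllable strings (hypothetical input format).
    """
    unigrams = Counter()   # syllable frequencies
    bigrams = Counter()    # adjacent syllable-pair frequencies
    pair_tokens = []       # (first, second, is_within_word) for every adjacent pair

    for utt in utterances:
        # Flatten the utterance, remembering which word each syllable belongs to.
        flat = [(syll, w_idx) for w_idx, word in enumerate(utt) for syll in word]
        unigrams.update(syll for syll, _ in flat)
        for (a, wa), (b, wb) in zip(flat, flat[1:]):
            bigrams[(a, b)] += 1
            pair_tokens.append((a, b, wa == wb))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    result = {}
    for label, flag in (("within", True), ("between", False)):
        pairs = [(a, b) for a, b, within in pair_tokens if within == flag]
        result[label] = {
            # Forward TP: how predictable the second syllable is given the first.
            "forward": mean([bigrams[p] / unigrams[p[0]] for p in pairs]),
            # Backward TP: how predictable the first syllable is given the second.
            "backward": mean([bigrams[p] / unigrams[p[1]] for p in pairs]),
        }
    return result


# Toy, made-up input (not taken from the corpus described above).
toy = [
    [["das"], ["ba", "by"]],
    [["das"], ["au", "to"]],
    [["ein"], ["ba", "by"]],
]
print(syllable_tps(toy))
```

Averaging over pair tokens rather than pair types is one of several reasonable choices; the paper may have aggregated differently.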
Pages: 3-27
Number of pages: 25