Some Current Quantitative Problems in Corpus Linguistics and a Sketch of Some Solutions

Cited by: 18
Author
Gries, Stefan Th. [1]
Affiliation
[1] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Keywords
association measures; mixed-effects; multi-level modeling; MuPDAR; token; type frequencies; variability-based neighbor clustering; CORPORA; MULTIFACTORIAL; LANGUAGE;
DOI
10.1177/1606822X14556606
Chinese Library Classification number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
This paper surveys a variety of methodological problems in current quantitative corpus linguistics. Some problems discussed are from corpus linguistics in general, such as the impact that dispersion, type frequencies/entropies, and directionality (should) have on the computation of association measures as well as the impact that neglecting the sampling structure of a corpus can have on a statistical analysis. Others involve more specialized areas in which corpus-linguistic work is currently booming, such as historical linguistics and learner corpus research. For each of the problems, first ideas/pointers as to how these problems can be resolved are provided and exemplified in some detail.
Pages: 93-117
Page count: 25