Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition

Cited by: 37
Authors
Herdagdelen, Amac [1]
Marelli, Marco [2]
Affiliations
[1] Facebook, 1 Hacker Way, Menlo Park, CA 94025, USA
[2] University of Trento, Center for Mind/Brain Sciences, Trento, Italy
Keywords
Frequency effects; Social media; Lexical decision; Text corpora; ACQUISITION; KUCERA; CORPUS; SIZE; AGE
DOI
10.1111/cogs.12392
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Corpus-based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency-based estimators of lexical decision reaction times (up to 3.6% increase in explained variance). The results are robust (observed for Twitter- and Facebook-based frequencies on American English and British English datasets) and are still substantial when we control for corpus size.
Pages: 976-995
Page count: 20