Creating a live, public short message service corpus: the NUS SMS corpus

Cited by: 53
Authors
Chen, Tao [1]
Kan, Min-Yen [1]
Affiliations
[1] National University of Singapore, School of Computing, Singapore 117417, Singapore
Keywords
SMS corpus; Corpus creation; English; Chinese; Crowdsourcing; Mechanical Turk; Zhubajie; SEPTEMBER; 11
DOI
10.1007/s10579-012-9197-9
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Short Message Service (SMS) messages are short messages sent from one person to another from their mobile phones. They represent a means of personal communication that is an important communicative artifact in our current digital era. As most existing studies have used private access to SMS corpora, comparative studies using the same raw SMS data have not been possible up to now. We describe our efforts to collect a public SMS corpus to address this problem. We use a battery of methodologies to collect the corpus, paying particular attention to privacy issues to address contributors' concerns. Our live project collects new SMS message submissions, checks their quality, and adds valid messages. We release the resultant corpus as XML and as SQL dumps, along with monthly corpus statistics. We opportunistically collect as much metadata about the messages and their senders as possible, so as to enable different types of analyses. To date, we have collected more than 71,000 messages, focusing on English and Mandarin Chinese.
Pages: 299-335
Page count: 37
Source
Language Resources and Evaluation, 2013, 47: 299-335