Bridging social media via distant supervision

Cited by: 2
Authors
Magdy, Walid [1]
Sajjad, Hassan [1]
El-Ganainy, Tarek [1]
Sebastiani, Fabrizio [1]
Affiliations
[1] Hamad Bin Khalifa Univ, Qatar Comp Res Inst, Doha, Qatar
Keywords
Twitter; YouTube; Tweet classification; Distant supervision
DOI
10.1007/s13278-015-0275-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Microblog classification has received a lot of attention in recent years. Different classification tasks have been investigated, most of them focusing on classifying microblogs into a small number of classes (five or fewer) using a training set of manually annotated tweets. Unfortunately, labelling data is tedious and expensive, and finding tweets that cover all the classes of interest is not always straightforward, especially when some of the classes do not frequently arise in practice. In this paper, we study an approach to tweet classification based on distant supervision, whereby we automatically transfer labels from one social medium to another for a single-label multi-class classification task. In particular, we apply YouTube video classes to tweets linking to these videos. This provides, for free, a virtually unlimited number of labelled instances that can be used as training data. The classification experiments we have run show that training a tweet classifier via these automatically labelled data achieves substantially better performance than training the same classifier with a limited amount of manually labelled data; this is advantageous, given that the automatically labelled data come at no cost. Further investigation of our approach shows its robustness when applied with different numbers of classes and across different languages.
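The label-transfer step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `video_categories` lookup table (standing in for however video classes are actually obtained from YouTube), and the choice to strip links from the tweet text are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse, parse_qs

# Simple URL matcher; real tweets may need a more careful tokenizer (assumption).
URL_PATTERN = re.compile(r"https?://\S+")


def extract_video_id(url):
    """Return the YouTube video ID from a watch link or short link, else None."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


def label_tweets(tweets, video_categories):
    """Give each tweet the class of the first known YouTube video it links to.

    `video_categories` is a hypothetical mapping from video ID to class name.
    Tweets without a link to a known video are skipped.
    """
    labelled = []
    for text in tweets:
        for url in URL_PATTERN.findall(text):
            category = video_categories.get(extract_video_id(url))
            if category is not None:
                # Assumption: drop links so only the tweet's words remain.
                clean_text = URL_PATTERN.sub("", text).strip()
                labelled.append((clean_text, category))
                break
    return labelled


# Toy data, invented for this example only.
tweets = [
    "Great goal last night https://www.youtube.com/watch?v=abc123",
    "New single out now https://youtu.be/xyz789",
    "No link here",
    "Unknown video https://youtu.be/zzz",
]
video_categories = {"abc123": "Sports", "xyz789": "Music"}
training_pairs = label_tweets(tweets, video_categories)
```

The resulting `(text, class)` pairs could then serve as training data for any single-label multi-class text classifier; no manual annotation is involved at any step.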
Pages: 1-12
Number of pages: 12