Bridging social media via distant supervision

Cited by: 2
Authors
Magdy, Walid [1]
Sajjad, Hassan [1]
El-Ganainy, Tarek [1]
Sebastiani, Fabrizio [1]
Affiliations
[1] Hamad Bin Khalifa Univ, Qatar Comp Res Inst, Doha, Qatar
Keywords
Twitter; YouTube; Tweet classification; Distant supervision
DOI
10.1007/s13278-015-0275-z
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Microblog classification has received a lot of attention in recent years. Different classification tasks have been investigated, most of them focusing on classifying microblogs into a small number of classes (five or fewer) using a training set of manually annotated tweets. Unfortunately, labelling data is tedious and expensive, and finding tweets that cover all the classes of interest is not always straightforward, especially when some of the classes rarely arise in practice. In this paper, we study an approach to tweet classification based on distant supervision, whereby we automatically transfer labels from one social medium to another for a single-label multi-class classification task. In particular, we apply YouTube video classes to tweets linking to these videos. This provides, for free, a virtually unlimited number of labelled instances that can be used as training data. The classification experiments we have run show that training a tweet classifier on these automatically labelled data achieves substantially better performance than training the same classifier on a limited amount of manually labelled data; this is advantageous, given that the automatically labelled data come at no cost. Further investigation of our approach shows its robustness when applied with different numbers of classes and across different languages.
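The label-transfer step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `distant_labels`, the `video_category` lookup table, the video IDs, and the sample tweets are all hypothetical, and the sketch assumes the category of each linked video has already been obtained by some other means.

```python
import re

# Matches two common YouTube link shapes and captures the 11-character video ID.
YOUTUBE_ID = re.compile(r"(?:youtube\.com/watch\?v=|youtu\.be/)([A-Za-z0-9_-]{11})")
# Matches any http(s) URL so it can be stripped from the tweet text.
URL = re.compile(r"https?://\S+")


def distant_labels(tweets, video_category):
    """Return (text, label) pairs for tweets linking to a video of known category."""
    labelled = []
    for tweet in tweets:
        match = YOUTUBE_ID.search(tweet)
        if match is None:
            continue  # no YouTube link, so there is no label to transfer
        label = video_category.get(match.group(1))
        if label is None:
            continue  # the linked video's category is not in the lookup table
        text = URL.sub("", tweet).strip()  # keep only the tweet's own words
        labelled.append((text, label))
    return labelled


# Hypothetical toy data: video ID -> category, plus a few made-up tweets.
video_category = {"abc12345678": "Music", "xyz12345678": "Sports"}
tweets = [
    "Loving this new single https://youtu.be/abc12345678",
    "What a goal! https://www.youtube.com/watch?v=xyz12345678",
    "Just had lunch",
    "Unknown clip https://youtu.be/qrs12345678",
]
training_pairs = distant_labels(tweets, video_category)
```

Under these assumptions, the resulting `training_pairs` could be passed to any standard text classifier in place of manually annotated tweets; tweets without a link, or linking to a video of unknown category, are simply skipped.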
Pages: 1-12
Number of pages: 12