Text-Based Twitter User Geolocation Prediction

Cited by: 155
Authors
Han, Bo [1, 2]
Cook, Paul [1]
Baldwin, Timothy [1, 2]
Affiliations
[1] Univ Melbourne, Melbourne, Vic 3010, Australia
[2] NICTA Victoria Res Lab, Melbourne, Vic, Australia
Funding
Australian Research Council
Keywords
NETWORK;
DOI
10.1613/jair.4200
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Geographical location is vital to geospatial applications like local search and event detection. In this paper, we investigate and improve on the task of text-based geolocation prediction of Twitter users. Previous studies on this topic have typically assumed that geographical references (e.g., gazetteer terms, dialectal words) in a text are indicative of its author's location. However, these references are often buried in informal, ungrammatical, and multilingual data, and are therefore non-trivial to identify and exploit. We present an integrated geolocation prediction framework and investigate what factors impact on prediction accuracy. First, we evaluate a range of feature selection methods to obtain "location indicative words". We then evaluate the impact of nongeotagged tweets, language, and user-declared metadata on geolocation prediction. In addition, we evaluate the impact of temporal variance on model generalisation, and discuss how users differ in terms of their geolocatability. We achieve state-of-the-art results for the text-based Twitter user geolocation task, and also provide the most extensive exploration of the task to date. Our findings provide valuable insights into the design of robust, practical text-based geolocation prediction systems.
Pages: 451-500
Page count: 50