Language independent Big-Data system for the prediction of user location on Twitter

Authors
Alonso-Lorenzo, Jaime [1]
Costa-Montenegro, Enrique [1]
Fernandez-Gavilanes, Milagros [1]
Affiliations
[1] Univ Vigo, Telemat Engn Dept, Vigo, Spain
Keywords
Big-data; Social networks; Twitter; User location; Natural Language Processing; Network theory
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Social media interactions have become increasingly important. A 2014 survey of adult Americans found that a majority of respondents use at least one social media site. Twitter in particular serves 310 million monthly active users, and thousands of tweets are published every second. The public nature of this data makes it a prime candidate for data mining. Twitter users publish messages of up to 140 characters and can geo-tag them using a variety of methods: GPS coordinates, IP geolocation, and user-declared location. However, few users disclose their location: according to our empirical findings, only between 1% and 3% of users provide location data. In this article, we aggregate information from different sources to estimate the location of any Twitter user. We use a hybrid approach that combines techniques from Natural Language Processing and network theory. Tests were conducted on two datasets by inferring the location of each individual user and comparing it against the actual known location of users with geolocation information. The estimation error is the distance in kilometers between the estimated and the actual location. We also compare the relative average error per country, to account for differences in country size. Our results improve on those reported in other studies in the literature. Our approach is independent of the language used by the user, whereas most works in the literature handle just one language or a reduced set of languages. The article also shows how our estimation approach evolved and the impact that each modification had on the results.
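The error metric described in the abstract, the distance in kilometers between an estimated and an actual location, can be illustrated with a short sketch. The abstract does not say how the distance is computed, so the haversine great-circle formula, the Earth radius constant, and the function names `haversine_km` and `mean_error_km` below are assumptions made for illustration, not the authors' implementation.

```python
import math

# Mean Earth radius in kilometers (assumed constant, not from the article).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_error_km(estimated, actual):
    """Average error over paired lists of (lat, lon) tuples (hypothetical helper)."""
    errors = [haversine_km(e[0], e[1], a[0], a[1])
              for e, a in zip(estimated, actual)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Illustrative coordinates only: roughly Vigo and Madrid.
    print(haversine_km(42.24, -8.72, 40.42, -3.70))
```

The abstract also mentions a relative average error per country; because the normalization is not stated there, it is left out of this sketch.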
Pages: 2437-2446
Number of pages: 10