A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin-destination information

Cited by: 3
Authors
Graff, Mario [1, 3, 4]
Moctezuma, Daniela [2, 3]
Miranda-Jimenez, Sabino [1, 3]
Tellez, Eric S. [1, 3]
Affiliations
[1] INFOTEC Ctr Invest & Innovac Tecnol Informac & Co, Circuito Tecnopolo 112,Fracc Tecnopolo Pocitos 2, Aguascalientes 20313, Aguascalientes, Mexico
[2] CentroGEO Ctr Invest Ciencias Informac Geoespacia, Circuito Tecnopolo Norte 117, Aguascalientes 20313, Aguascalientes, Mexico
[3] CONACyT Consejo Nacl Ciencia & Tecnol, Direcc Catedras, Insurgentes Sur 1582, Mexico City 03940, DF, Mexico
[4] Colgate Univ, Dept Comp Sci, 13 Oak Dr, Hamilton, NY 13346 USA
Keywords
Twitter exploratory analysis; Mobility patterns; Open-source Python library
DOI
10.1016/j.cageo.2021.105012
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Twitter is perhaps the social media platform most amenable to research. It requires only a few steps to obtain information, and there are plenty of libraries that can help in this regard. Nonetheless, knowing whether a particular event is expressed on Twitter is a challenging task that requires a considerable collection of tweets. This proposal aims to make the process of mining events on Twitter easier for interested researchers by opening a collection of processed information taken from Twitter since December 2015. The events could be related to natural disasters, health issues, and people's mobility, among other studies that can be pursued with the proposed library. Different applications are presented in this contribution to illustrate the library's capabilities: an exploratory analysis of the topics discovered in tweets, a study on similarity among dialects of the Spanish language, and a mobility report on different countries. In summary, the Python library presented is applied to different domains and retrieves a wealth of information: daily frequencies of words and word bi-grams for the Arabic, English, Spanish, and Russian languages, as well as mobility information on the number of trips among locations for more than 200 countries or territories.
Pages: 11