A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin-destination information

Cited by: 3
Authors
Graff, Mario [1, 3, 4]
Moctezuma, Daniela [2, 3]
Miranda-Jimenez, Sabino [1, 3]
Tellez, Eric S. [1, 3]
Affiliations
[1] INFOTEC Ctr Invest & Innovac Tecnol Informac & Co, Circuito Tecnopolo 112,Fracc Tecnopolo Pocitos 2, Aguascalientes 20313, Aguascalientes, Mexico
[2] CentroGEO Ctr Invest Ciencias Informac Geoespacia, Circuito Tecnopolo Norte 117, Aguascalientes 20313, Aguascalientes, Mexico
[3] CONACyT Consejo Nacl Ciencia & Tecnol, Direcc Catedras, Insurgentes Sur 1582, Mexico City 03940, DF, Mexico
[4] Colgate Univ, Dept Comp Sci, 13 Oak Dr, Hamilton, NY 13346 USA
Keywords
Twitter exploratory analysis; Mobility patterns; Open-source Python library
DOI
10.1016/j.cageo.2021.105012
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Twitter is perhaps the social media platform most amenable to research. Obtaining information requires only a few steps, and plenty of libraries can help in this regard. Nonetheless, determining whether a particular event is reflected on Twitter is a challenging task that requires a considerable collection of tweets. This proposal aims to facilitate, for interested researchers, the process of mining events on Twitter by opening a collection of processed information taken from Twitter since December 2015. The events could be related to natural disasters, health issues, and people's mobility, among other topics that can be studied with the proposed library. Different applications are presented in this contribution to illustrate the library's capabilities: an exploratory analysis of the topics discovered in tweets, a study of the similarity among dialects of the Spanish language, and a mobility report on different countries. In summary, the Python library presented is applied to different domains and retrieves a wealth of information: daily frequencies of words and word bi-grams for the Arabic, English, Spanish, and Russian languages, as well as mobility information on the number of travels among locations for more than 200 countries or territories.
Pages: 11
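
The abstract describes the published data as daily frequencies of words and bi-grams, plus counts of travels between locations. As a rough illustration of what such aggregates look like, the sketch below computes them from toy data using only the Python standard library. It does not use the library's own API: the function names, the whitespace tokenizer, and the sample tweets and trips are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def daily_frequencies(tweets):
    """Map each day to a Counter of words and a Counter of adjacent-word bi-grams."""
    words = defaultdict(Counter)
    bigrams = defaultdict(Counter)
    for day, text in tweets:
        tokens = tokenize(text)
        words[day].update(tokens)
        # zip pairs each token with its successor, yielding the bi-grams
        bigrams[day].update(zip(tokens, tokens[1:]))
    return words, bigrams


def origin_destination_counts(trips):
    """Count the number of travels per (origin, destination) pair."""
    return Counter(trips)


# Hypothetical toy data, invented for illustration only.
tweets = [
    ("2020-03-01", "earthquake in the city"),
    ("2020-03-01", "strong earthquake in the north"),
    ("2020-03-02", "the city is calm"),
]
trips = [("MX", "US"), ("MX", "US"), ("US", "CA")]

words, bigrams = daily_frequencies(tweets)
od = origin_destination_counts(trips)

print(words["2020-03-01"].most_common(3))
print(bigrams["2020-03-01"][("earthquake", "in")])
print(od[("MX", "US")])
```

Keeping one counter per day makes it straightforward to compare the frequency of a word or bi-gram across dates, which is the kind of question an event-mining study would ask of such data.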