Spatiotemporal Analysis of Traffic Data: Correspondence Analysis with Fuzzified Variables vs. Principal Component Analysis Using Weather and Gas Price as Extra Data

Cited by: 0
Author
Loslever, Pierre [1 ,2 ]
Affiliations
[1] Univ Polytech Hauts de France, Lab Automat Mecan & Informat Ind & Humaines, UMR CNRS 8201, Campus Mt Houy, F-59313 Valenciennes 9, France
[2] Polytech Univ Hauts France, LAMIH, Valenciennes, France
Source
NETWORKS & SPATIAL ECONOMICS | 2024 / Vol. 24 / Issue 03
Keywords
Traffic analysis; Origin-destination matrix; Correspondence analysis; Principal component analysis; Economics; Weather; Fuzzy sets; TRAVEL; PATTERNS; MATRICES; PRIVACY; DEMAND; MODEL; CITY;
DOI
10.1007/s11067-024-09624-4
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
Study of large rail traffic databases presents formidable challenges for transport system specialists, more particularly while keeping both space and time factors together with the possibility of showing influencing factors related to the users and the transport network environment. To perform such a study, a bibliographic analysis in both statistics and transport revealed that geometrical methods for feature extraction and dimension reduction can be seen as suitable. Since there are several methods/options with, in principle, required input data, this article aims at comparing Principal Component Analysis (PCA) and Correspondence Analysis (CA) for traffic frequency data, both methods being actually used with such data. The procedure stands as follows. First a grand matrix is built where the rows correspond to time windows and the columns to all the possible origin-destination links. Then this large frequency matrix is studied using PCA and CA. The next part of the procedure consists in studying the effects of influencing factors with the possibility of keeping the quantitative scales with PCA or using fuzzy segmentation with CA, the corresponding data being considered as supplementary column points. The procedure is applied on a rail transport network including 10 stations (one corresponding to the airport) and one-hour time windows for 4 months, the available influencing factors being the temperature, rain level and gas price. The comparative analysis shows that CA graphical outputs are more complicated than PCA ones, but reveal more specific results, e.g. the network user behavior related to the airport, while PCA mainly opposes link clusters with low vs. high frequencies. Fuzzy windowing performed using actual and simulated data reduces the loss of information when averaging, e.g. over time, and can show non-linear relational phenomena. The possibility of displaying new traffic data in real time is also considered.
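The procedure described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the data are simulated, the function names `correspondence_analysis` and `fuzzy_code` are hypothetical, and the triangular three-class partition (low / medium / high temperature) is an assumed form of fuzzy segmentation. The sketch builds a time-window by origin-destination frequency matrix, runs CA through an SVD of the standardized residuals, and places the fuzzified temperature as supplementary column points.

```python
import numpy as np


def correspondence_analysis(N, n_axes=2):
    """CA of a positive frequency matrix N (rows: time windows, cols: O-D links)."""
    P = N / N.sum()
    r = P.sum(axis=1)  # row masses
    c = P.sum(axis=0)  # column masses
    # Standardized residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    U, sv, Vt = U[:, :n_axes], sv[:n_axes], Vt[:n_axes]
    row_std = U / np.sqrt(r)[:, None]                # standard row coordinates
    row_coords = row_std * sv                        # principal row coordinates
    col_coords = (Vt.T / np.sqrt(c)[:, None]) * sv   # principal column coordinates
    return row_coords, col_coords, row_std, sv


def fuzzy_code(x, centers):
    """Triangular fuzzy partition; memberships of each observation sum to 1."""
    x = np.clip(np.asarray(x, dtype=float), centers[0], centers[-1])
    M = np.zeros((x.size, len(centers)))
    for k in range(len(centers) - 1):
        lo, hi = centers[k], centers[k + 1]
        inside = (x >= lo) & (x <= hi)
        t = (x[inside] - lo) / (hi - lo)
        M[inside, k] = 1.0 - t
        M[inside, k + 1] = t
    return M


rng = np.random.default_rng(0)

# Simulated data (assumption): 168 one-hour windows, 10 stations -> 90 ordered O-D links.
n_windows, n_links = 168, 10 * 9
N = rng.poisson(5.0, size=(n_windows, n_links)) + 1  # +1 keeps every margin positive

row_coords, col_coords, row_std, sv = correspondence_analysis(N)

# Simulated hourly temperature, fuzzified into low / medium / high (assumed classes).
temperature = rng.uniform(-5.0, 30.0, size=n_windows)
memberships = fuzzy_code(temperature, centers=[-5.0, 12.5, 30.0])

# Supplementary column points: each column profile is projected onto the
# standard row coordinates (CA transition formula); they do not affect the axes.
sup_profiles = memberships / memberships.sum(axis=0)
sup_coords = sup_profiles.T @ row_std
print(sup_coords)
```

Because supplementary points use the same transition formula as active columns, projecting an active O-D column this way reproduces its principal coordinates, which gives a convenient sanity check on the implementation.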
Pages: 531 - 563
Number of pages: 33
Related papers
50 records in total
  • [1] Data Analysis Using Principal Component Analysis
    Sehgal, Shrub
    Singh, Harpreet
    Agarwal, Mohit
    Bhasker, V.
    Shantanu
    2014 INTERNATIONAL CONFERENCE ON MEDICAL IMAGING, M-HEALTH & EMERGING COMMUNICATION SYSTEMS (MEDCOM), 2015 : 45 - 48
  • [2] Feature Selection of Weather Data with Interval Principal Component Analysis
    He, Chong-Cheng
    Jeng, Jin-Tsong
    2016 INTERNATIONAL CONFERENCE ON SYSTEM SCIENCE AND ENGINEERING (ICSSE), 2016,
  • [3] Comparison of two exploratory data analysis methods for fMRI: fuzzy clustering vs. principal component analysis
    Baumgartner, R
    Ryner, L
    Richter, W
    Summers, R
    Jarmasz, M
    Somorjai, R
    MAGNETIC RESONANCE IMAGING, 2000, 18 (01) : 89 - 94
  • [4] Data envelopment analysis vs. principal component analysis: An illustrative study of economic performance of Chinese cities
    Zhu, J
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1998, 111 (01) : 50 - 61
  • [5] Principal component analysis of urban traffic characteristics and meteorological data
    Nagendra, SMS
    Khare, M
    TRANSPORTATION RESEARCH PART D-TRANSPORT AND ENVIRONMENT, 2003, 8 (04) : 285 - 297
  • [6] Gene selection for microarray data analysis using principal component analysis
    Wang, AT
    Gehan, EA
    STATISTICS IN MEDICINE, 2005, 24 (13) : 2069 - 2087
  • [7] Two types of single-peaked data: Correspondence analysis as an alternative to principal component analysis
    Polak, Marike
    Heiser, Willem J.
    de Rooij, Mark
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (08) : 3117 - 3128
  • [8] DISCARDING VARIABLES IN A PRINCIPAL COMPONENT ANALYSIS .2. REAL DATA
    JOLLIFFE, IT
    THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 1973, 22 (01) : 21 - 31
  • [9] SCALING VARIABLES AND INTERPRETATION OF EIGENVALUES IN PRINCIPAL COMPONENT ANALYSIS OF GEOLOGIC DATA
    MIESCH, AT
    JOURNAL OF THE INTERNATIONAL ASSOCIATION FOR MATHEMATICAL GEOLOGY, 1980, 12 (06) : 523 - 538
  • [10] Classification of cytometry data using principal component analysis
    Venkatapathi, M
    Rajwa, B
    Gregori, GJ
    Hirleman, ED
    Robinson, JP
    CYTOMETRY PART A, 2004, 59A (01) : 88 - 88