Analysis of Web Browsing Data: A Guide

Cited by: 1
Authors
von Hohenberg, Bernhard Clemm [1, 10]
Stier, Sebastian [2, 8]
Cardenal, Ana S. [3]
Guess, Andrew M. [4, 5]
Menchen-Trevino, Ericka [6]
Wojcieszak, Magdalena [7, 9]
Affiliations
[1] GESIS Leibniz Inst Social Sci, Cologne, Germany
[2] GESIS Leibniz Inst Social Sci, Computat Social Sci Dept, Cologne, Germany
[3] Univ Oberta Catalunya, Barcelona, Spain
[4] Princeton Univ, Polit & Publ Affairs, Princeton, NJ USA
[5] Princeton Univ, Ctr Informat Technol Policy, Princeton, NJ USA
[6] Amer Univ, Washington, DC USA
[7] Univ Calif Davis, Davis, CA USA
[8] Univ Mannheim, Sch Social Sci, Mannheim, Germany
[9] Univ Amsterdam, Amsterdam Sch Commun Res, Amsterdam, Netherlands
[10] GESIS Leibniz Inst Social Sciences, Dept Computat Social Sci, D-50667 Cologne, Germany
Funding
European Research Council;
Keywords
web browsing data; digital trace data; web tracking data; computational social science; ONLINE; NEWS;
DOI
10.1177/08944393241227868
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
The use of individual-level browsing data, that is, the records of a person's visits to online content through a desktop or mobile browser, is of increasing importance for social scientists. Browsing data have characteristics that raise many questions for statistical analysis, yet to date, little hands-on guidance on how to handle them exists. Reviewing extant research, and exploring data sets collected by our four research teams spanning seven countries and several years, with over 14,000 participants and 360 million web visits, we derive recommendations along four steps: preprocessing the raw data; filtering out observations; classifying web visits; and modelling browsing behavior. The recommendations we formulate aim to foster best practices in the field, which so far has paid little attention to justifying the many decisions researchers need to take when analyzing web browsing data.
Pages: 26