Automated Extraction of Structured Data from the Social Network Instagram

Cited by: 0
Authors
Frantis, Petr [1]
Bures, Michel [1]
Coufalikova, Aneta [1]
Klaban, Ivo [1]
Affiliations
[1] Univ Def, Fac Mil Technol, Dept Informat & Cyber Operat, Brno, Czech Republic
Keywords
Instagram; Profiling; Instagram Private API; Automation; Osintgram; Python
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The paper explores the extraction of structured information from the social network Instagram through a suitable application programming interface, namely the unofficial Instagram Private API. It focuses on creating a computer program that identifies which posts a user has marked with a "Like" and then stores this information for profiling specific users. The introduction of the paper highlights the general use of social media in modern society and the importance of personal data for these platforms. It specifies the aim of the study, which is to extract information from Instagram and then analyse it for user profiling. It then describes the evolution of the social network Instagram and key features such as different types of posts. The paper then focuses on the solution and its implementation in the Python programming language, with the aim of minimizing the load on Instagram servers and reducing the risk that the automated processes are detected. It describes the process of setting up new Instagram accounts, the obstacles in obtaining login credentials, and the need to simulate human behaviour to bypass the network's defence mechanisms. It then focuses on the actual retrieval of information such as the users followed, their posts and information about which posts the user has marked as favourites. It mentions that extracting data from closed profiles is difficult and elaborates on the technical challenges associated with this task. A significant part of this paper is a discussion of Instagram's defence mechanisms that respond to automated computer programs. It describes access denial, account blocking, and identity verification prompts such as CAPTCHA tests. Finally, the conclusion summarizes the results obtained, which indicate the acquisition of approximately 90,000 records for user profiling. It discusses the shortcomings of a fully automated solution due to Instagram's account creation conditions and defence mechanisms. It mentions the need for further research and highlights key gaps and challenges in this area. Overall, the study highlights the technical and security challenges in extracting information from Instagram and emphasizes the need for further research and improvements in the technical procedures for extracting data from the platform.
Pages: 157-164
Number of pages: 8