Automated Extraction of Structured Data from the Social Network Instagram

Authors
Frantis, Petr [1]
Bures, Michel [1]
Coufalikova, Aneta [1]
Klaban, Ivo [1]
Affiliations
[1] Univ Def, Fac Mil Technol, Dept Informat & Cyber Operat, Brno, Czech Republic
Keywords
Instagram; Profiling; Instagram Private API; Automation; Osintgram; Python
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The paper explores the extraction of structured information from the social network Instagram through a suitable application programming interface, namely the unofficial Instagram Private API. It focuses on creating a computer program that identifies which posts a user has marked with "Like" and stores this information for user profiling. The introduction highlights the general use of social media in modern society and the importance of personal data to these platforms, and states the aim of the study: to extract information from Instagram and analyse it for user profiling. The paper then describes the evolution of Instagram and key features such as its different types of posts. It next presents the solution and its implementation in the Python programming language, with the aim of minimizing the load on Instagram servers and reducing the risk that the automated processes are detected. It describes the process of setting up new Instagram accounts, the obstacles in obtaining login credentials, and the need to simulate human behaviour to bypass the network's defence mechanisms. It then turns to the actual retrieval of information: the users followed, their posts, and which posts the user has marked as favourites. It notes that extracting data from closed profiles is difficult and elaborates on the technical challenges of this task. A significant part of the paper discusses Instagram's defence mechanisms against automated computer programs, including access denial, account blocking, and identity verification prompts such as CAPTCHA tests. The conclusion summarizes the results, which indicate the acquisition of approximately 90,000 records for user profiling, and discusses why a fully automated solution falls short given Instagram's account creation conditions and defence mechanisms. It also identifies key gaps and challenges in this area. Overall, the study highlights the technical and security challenges in extracting information from Instagram and emphasises the need for further research and improvements in the technical procedures for extracting data from the platform.
Pages: 157-164
Page count: 8