Automated Extraction of Structured Data from the Social Network Instagram

Cited by: 0
Authors
Frantis, Petr [1 ]
Bures, Michel [1 ]
Coufalikova, Aneta [1 ]
Klaban, Ivo [1 ]
Affiliations
[1] Univ Def, Fac Mil Technol, Dept Informat & Cyber Operat, Brno, Czech Republic
Keywords
Instagram; Profiling; Instagram Private API; Automation; Osintgram; Python
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The paper explores the extraction of structured information from the social network Instagram through a suitable application programming interface, namely the unofficial Instagram Private API. It focuses on creating a computer program that identifies which posts a user has liked and then stores this information for profiling specific users. The introduction highlights the general use of social media in modern society and the importance of personal data for these platforms. It specifies the aim of the study, which is to extract information from Instagram and then analyse it for user profiling. It then describes the evolution of Instagram and key features such as the different types of posts. The paper then turns to the solution and its implementation in the Python programming language, designed to minimize the load on Instagram servers and reduce the risk that automated processes are detected. It describes the process of setting up new Instagram accounts, the obstacles in obtaining login credentials, and the need to simulate human behaviour to bypass the network's defence mechanisms. It then covers the actual retrieval of information, such as the users followed, their posts, and which posts the user has marked as favourites. It notes that extracting data from closed profiles is difficult and elaborates on the technical challenges associated with this task. A significant part of the paper discusses Instagram's defence mechanisms against automated computer programs, including access denial, account blocking, and identity verification prompts such as CAPTCHA tests. Finally, the conclusion summarizes the results, which indicate the acquisition of approximately 90,000 records for user profiling, and discusses the shortcomings of a fully automated solution caused by Instagram's account creation conditions and defence mechanisms. It identifies key gaps and challenges in this area. Overall, the study highlights the technical and security challenges of extracting information from Instagram and emphasises the need for further research and improvements in the technical procedures for extracting data from the platform.
Pages: 157 - 164
Number of pages: 8
Related papers
50 in total
  • [11] DBpedia and the live extraction of structured data from Wikipedia
    Morsey, Mohamed
    Lehmann, Jens
    Auer, Soeren
    Stadler, Claus
    Hellmann, Sebastian
    PROGRAM-ELECTRONIC LIBRARY AND INFORMATION SYSTEMS, 2012, 46 (02) : 157 - 181
  • [12] Title extraction from Loosely Structured Data Records
    Wu, Yi-Pu
    Zhang, Xue-Jie
    Li, Qing
    Chen, Jing
    PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 2623 - +
  • [13] Extraction of Failure Graphs from Structured and Unstructured data
    Schierle, Martin
    Trabold, Daniel
    SEVENTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2008, : 324 - 330
  • [14] Automated Extraction of Concept Matcher Thesaurus from Semi-Structured Catalogue-Like Sources of Data on the Web
    Lapaev, Maxim
    2016 18TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION AND SEMINAR ON INFORMATION SECURITY AND PROTECTION OF INFORMATION TECHNOLOGY (FRUCT-ISPIT), 2016, : 153 - 160
  • [15] Scatteract: Automated Extraction of Data from Scatter Plots
    Cliche, Mathieu
    Rosenberg, David
    Madeka, Dhruv
    Yee, Connie
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2017, PT I, 2017, 10534 : 135 - 150
  • [16] Automated Object Extraction from MLS Data: a Survey
    Chen Kunyuan
    Cheng Ming
    Zhou Menglan
    Chen Xinqu
    Chen Yifei
    Jonathan, Li
    Nie Hongshan
    2015 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND KNOWLEDGE ENGINEERING (ISKE), 2015, : 331 - 334
  • [17] FOOD IN THE CONTEXT OF SUSTAINABILITY - INSTAGRAM SOCIAL NETWORK ANALYSIS
    Pilar, Ladislav
    Stanislavska, Lucie Kvasnickova
    Prokop, Michal
    Jaaskelainen, Pia
    AGRARIAN PERSPECTIVES XXIX: TRENDS AND CHALLENGES OF AGRARIAN SECTOR, 2020, : 272 - 278
  • [18] Analyzing Labeled Cyberbullying Incidents on the Instagram Social Network
    Hosseinmardi, Homa
    Mattson, Sabrina Arredondo
    Ibn Rafiq, Rahat
    Han, Richard
    Lv, Qin
    Mishra, Shivakant
    SOCIAL INFORMATICS (SOCINFO 2015), 2015, 9471 : 49 - 66
  • [19] Friendship Paradox and Hashtag Embedding in the Instagram Social Network
    Serafimov, David
    Mirchev, Miroslav
    Mishkovski, Igor
    ICT INNOVATIONS 2019: BIG DATA PROCESSING AND MINING, 2019, 1110 : 121 - 133
  • [20] Genre Diversity of the Instagram Social Network in a Linguodidactic Perspective
    Kolesnikov, Andrei A.
    TOMSK STATE UNIVERSITY JOURNAL, 2021, (472) : 178 - 188