Automated Extraction of Structured Data from the Social Network Instagram

Cited by: 0
Authors
Frantis, Petr [1]
Bures, Michel [1]
Coufalikova, Aneta [1]
Klaban, Ivo [1]
Affiliations
[1] Univ Def, Fac Mil Technol, Dept Informat & Cyber Operat, Brno, Czech Republic
Keywords
Instagram; Profiling; Instagram Private API; Automation; Osintgram; Python
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The paper explores the extraction of structured information from the social network Instagram through a suitable application programming interface, namely the unofficial Instagram Private API. It focuses on creating a computer program that identifies which posts a user has marked with a "Like" and then stores this information for profiling specific users. The introduction of the paper highlights the general use of social media in modern society and the importance of personal data for these platforms. It specifies the aim of the study, which is to extract information from Instagram and then analyse it for user profiling. It then describes the evolution of the social network Instagram and key features such as different types of posts. The paper further focuses on the solution and its implementation in the Python programming language, designed to minimize the load on Instagram servers and reduce the risk of detection of automated processes. It describes the process of setting up new Instagram accounts, the obstacles in obtaining login credentials, and the need to simulate human behaviour to bypass the network's defence mechanisms. It then focuses on the actual retrieval of information such as the users followed, their posts, and information about which posts the user has marked as favourites. It mentions that extracting data from closed profiles is difficult and elaborates on the technical challenges associated with this task. A significant part of the paper is a discussion of Instagram's defence mechanisms that respond to automated computer programs. It describes access denial, account blocking, and identity verification prompts such as CAPTCHA tests. Finally, the conclusion summarizes the results obtained, which indicate the acquisition of approximately 90,000 records for user profiling. It discusses the shortcomings of a fully automated solution due to Instagram's account creation conditions and defence mechanisms. It mentions the need for further research and highlights key gaps and challenges in this area. Overall, the study highlights the technical and security challenges in extracting information from Instagram and emphasises the need for further research and improvements in the technical procedures for extracting data from the platform.
Pages: 157-164
Number of pages: 8