Breaking Text-Based CAPTCHAs using Average Vertical Partition

Authors
Liu, Xiyang [1]
Zhang, Yang [1]
Hu, Jing [1]
Tang, Mengyun [1]
Gao, Haichang [1]
Affiliations
[1] Xidian Univ, Inst Software Engn, Xian 710071, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CAPTCHA; security; text-based; K-nearest neighbor; average vertical partition
DOI
10.6688/JISE.201905_35(3).0008
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
CAPTCHA, which stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart, has been widely used as a security mechanism to defend against automated registration, spam and malicious bot programs. There have been many successful attacks on CAPTCHAs deployed by popular websites, e.g., Google, Yahoo!, and Microsoft. However, most of these methods are ad hoc, and they have lost efficacy with the evolution of CAPTCHA. In this paper, we propose a simple but effective attack on text-based CAPTCHA that uses machine learning to solve the segmentation and recognition problems simultaneously. The method first divides a CAPTCHA image into average blocks and attempts to combine adjacent blocks to form individual characters. A modified K-Nearest Neighbor (KNN) engine is used to recognize these combinations, and using a Dynamic Programming (DP) graph search algorithm, the most likely combinations are selected as the final result. We tested our attack on the popular CAPTCHAs deployed by the top 20 Alexa ranked websites. The success rates range from 5.0% to 74.0%, illustrating the effectiveness and universality of our method. We also tested the applicability of our method on three well-known CAPTCHA schemes. Our attack casts serious doubt on the security of existing text-based CAPTCHAs; therefore, guidelines for designing better text-based CAPTCHAs are discussed at the end of this paper.
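The pipeline described in the abstract (equal-width vertical blocks, merging adjacent blocks into candidate characters, then a dynamic-programming search over the candidates) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `partition_bounds`, `best_reading`, and `toy_score` are hypothetical, the modified KNN engine is replaced by an arbitrary `score` callback, and fixing the number of characters in advance is an assumption made only to keep the search simple.

```python
from typing import Callable, List, Tuple

# Hypothetical recognizer interface: given a candidate spanning blocks
# [i, j), return (label, confidence). In the paper this role is played by
# a modified KNN engine; here it is only a callback supplied by the caller.
Scorer = Callable[[int, int], Tuple[str, float]]


def partition_bounds(width: int, n_blocks: int) -> List[int]:
    """Column boundaries that split [0, width) into near-equal vertical blocks."""
    return [round(i * width / n_blocks) for i in range(n_blocks + 1)]


def best_reading(n_blocks: int, n_chars: int, max_span: int,
                 score: Scorer) -> Tuple[str, float]:
    """Pick the n_chars adjacent-block combinations with the highest total score.

    best[k][j] is the best total confidence for reading k characters from
    the first j blocks. Assumption: the character count is known up front.
    """
    neg = float("-inf")
    best = [[neg] * (n_blocks + 1) for _ in range(n_chars + 1)]
    back = [[None] * (n_blocks + 1) for _ in range(n_chars + 1)]
    best[0][0] = 0.0
    for k in range(1, n_chars + 1):
        for j in range(1, n_blocks + 1):
            # A character may cover at most max_span adjacent blocks.
            for i in range(max(0, j - max_span), j):
                if best[k - 1][i] == neg:
                    continue
                label, conf = score(i, j)
                total = best[k - 1][i] + conf
                if total > best[k][j]:
                    best[k][j] = total
                    back[k][j] = (i, label)
    if best[n_chars][n_blocks] == neg:
        return "", neg  # no way to cover all blocks with n_chars characters
    # Walk the back-pointers from the last block to recover the labels.
    labels, j = [], n_blocks
    for k in range(n_chars, 0, -1):
        i, label = back[k][j]
        labels.append(label)
        j = i
    return "".join(reversed(labels)), best[n_chars][n_blocks]


def toy_score(i: int, j: int) -> Tuple[str, float]:
    """Stand-in recognizer: confident only on three hand-picked spans."""
    truth = {(0, 2): "A", (2, 3): "B", (3, 6): "C"}
    return (truth[(i, j)], 0.9) if (i, j) in truth else ("?", 0.1)


if __name__ == "__main__":
    print(partition_bounds(120, 6))
    print(best_reading(n_blocks=6, n_chars=3, max_span=3, score=toy_score))
```

In a real attack the callback would crop the image between `bounds[i]` and `bounds[j]` and classify the crop. The toy scorer above exists only to show that the search recovers the segmentation whose candidates the recognizer trusts most.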
Pages: 611-634
Number of pages: 24