Cracking the neural code for word recognition in convolutional neural networks

Cited by: 0
Authors
Agrawal, Aakash [1]
Dehaene, Stanislas [1, 2]
Affiliations
[1] Univ Paris Saclay, NeuroSpin Ctr, INSERM U 992, Cognit Neuroimaging Unit, CEA, Gif Sur Yvette, France
[2] Univ Paris Sci Lettres PSL, Coll France, Paris, France
Funding
European Union Horizon 2020;
Keywords
LETTER POSITION; WRITTEN WORDS; REPRESENTATION; MODEL; PERCEPTION; INSIGHTS; CORTEX; SPACE; FMRI;
DOI
10.1371/journal.pcbi.1012430
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Learning to read places a strong challenge on the visual system. Years of expertise lead to a remarkable capacity to separate similar letters and encode their relative positions, thus distinguishing words such as FORM and FROM, invariantly over a large range of positions, sizes and fonts. How neural circuits achieve invariant word recognition remains unknown. Here, we address this issue by recycling deep neural network models initially trained for image recognition. We retrain them to recognize written words and then analyze how reading-specialized units emerge and operate across the successive layers. With literacy, a small subset of units becomes specialized for word recognition in the learned script, similar to the visual word form area (VWFA) in the human brain. We show that these units are sensitive to specific letter identities and their ordinal position from the left or the right of a word. The transition from retinotopic to ordinal position coding is achieved by a hierarchy of "space bigram" units that detect the position of a letter relative to a blank space and that pool across low- and high-frequency-sensitive units from early layers of the network. The proposed scheme provides a plausible neural code for written words in the VWFA, and leads to predictions for reading behavior, error patterns, and the neurophysiology of reading.

Reading is a fundamental skill in modern society, yet the neural mechanisms that allow us to quickly recognize words remain poorly understood. Our research aims to unravel how the brain achieves invariant word recognition, that is, the ability to recognize words regardless of their position, size, or font. We studied artificial neural networks trained to recognize words, mirroring human learning. Our findings reveal that these networks develop specialized units for word recognition, similar to the Visual Word Form Area in the human brain. These units are sensitive to specific letters and their positions within a word. Crucially, we discovered that they achieve this by detecting the spaces around words as reference points. This creates a hierarchical system where early layers detect basic features and spaces, while higher layers combine this information to recognize specific letters at certain positions relative to word edges. This "space bigram" model reconciles previous theories of letter bigrams and letter-position coding. Our results suggest that most written languages may be processed using similar basic principles. This understanding could inform better methods for teaching reading and treating reading disorders.
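The idea of coding each letter by its ordinal position from the left or right word edge can be illustrated with a toy sketch. This is not the authors' network or analysis code: the function name `space_bigram_code`, the tuple layout, and the restriction to a single word padded with blanks are all assumptions made for illustration only.

```python
def space_bigram_code(line: str) -> frozenset:
    """Toy ordinal code for one word surrounded by blank space.

    Hypothetical illustration, not the paper's model. Each element is
    (letter, edge, k): the letter is the k-th character counted from the
    blank space on the left ("L") or right ("R") edge of the word.
    Assumes the input holds a single word with no internal spaces.
    """
    word = line.strip(" ")  # the blanks act only as reference points
    n = len(word)
    code = set()
    for i, ch in enumerate(word):
        code.add((ch, "L", i + 1))  # ordinal position from the left edge
        code.add((ch, "R", n - i))  # ordinal position from the right edge
    return frozenset(code)


# The same word at two different horizontal offsets yields the same code,
# while a transposed-letter neighbour yields a different one.
form_left = space_bigram_code("FORM      ")
form_right = space_bigram_code("      FORM")
from_word = space_bigram_code("   FROM   ")

print(form_left == form_right)  # True: invariant to where the word sits
print(form_left == from_word)   # False: FORM and FROM are distinguished
```

In this sketch the code depends only on each letter's distance from the flanking blanks, so shifting the word changes nothing, whereas swapping O and R changes four of the eight elements.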
Pages: 22