A Multimodal Connectionist Architecture for Unsupervised Grounding of Spatial Language

Citations: 10
Authors
Vavrecka, Michal [1]
Farkas, Igor [2]
Affiliations
[1] Czech Tech Univ, Dept Cybernet, CR-16635 Prague, Czech Republic
[2] Comenius Univ, Dept Appl Informat, Bratislava 84248, Slovakia
Keywords
Unsupervised learning; Self-organizing map; Symbol grounding; Spatial phrases; Multimodal representations; Self-organizing network; Words; Model
DOI
10.1007/s12559-013-9212-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a bio-inspired unsupervised connectionist architecture and apply it to grounding spatial phrases. The two-layer architecture combines information from the visual and phonological inputs by concatenation. In the first layer, the visual pathway employs separate 'what' and 'where' subsystems that represent the identity and the spatial relations of two objects in 2D space, respectively. Bitmap images are presented to an artificial retina, and phonologically encoded five-word sentences describing each image (e.g., 'blue ball above red cup') serve as the phonological input. The visual scene is thus represented by several self-organizing maps (SOMs), while the phonological description is processed by a Recursive SOM that learns to represent the spatial phrases topographically. Primary representations from the first-layer modules are unambiguously integrated in a multimodal second-layer module, implemented by either the SOM or the 'neural gas' algorithm. The system learns to bind the proper lexical and visual features without any prior knowledge. The simulations reveal that processing and representing spatial location and object shape separately significantly improves the performance of the model. We provide quantitative experimental results comparing three models in terms of their accuracy.
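The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `SOM` class, the map sizes, the feature dimensions, and the random input data are all hypothetical, and a plain SOM stands in for the Recursive SOM used on the phonological input. It only shows how first-layer map responses might be concatenated and passed to a second-layer multimodal map.

```python
import numpy as np


class SOM:
    """Minimal self-organizing map on a 2D grid (illustrative only)."""

    def __init__(self, rows, cols, dim, rng):
        self.weights = rng.random((rows * cols, dim))
        # Grid coordinates of each unit, used by the neighbourhood function.
        self.grid = np.array(
            [(r, c) for r in range(rows) for c in range(cols)], dtype=float
        )

    def winner(self, x):
        # Index of the unit whose prototype is closest to the input.
        return int(np.argmin(np.linalg.norm(self.weights - x, axis=1)))

    def activation(self, x):
        # Soft response of every unit: closer prototypes respond more strongly.
        return np.exp(-np.linalg.norm(self.weights - x, axis=1))

    def train(self, data, epochs=20, lr0=0.5, sigma0=2.0):
        for epoch in range(epochs):
            frac = epoch / epochs
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
            for x in data:
                w = self.winner(x)
                grid_dist2 = np.sum((self.grid - self.grid[w]) ** 2, axis=1)
                h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
                self.weights += lr * h[:, None] * (x - self.weights)


rng = np.random.default_rng(0)
n = 200
# Placeholder inputs (assumptions, not the paper's encoding).
what_in = rng.random((n, 8))    # hypothetical object-identity features
where_in = rng.random((n, 2))   # hypothetical relative-position features
phon_in = rng.random((n, 10))   # hypothetical fixed-length sentence encoding

# First layer: one map per subsystem.
what_som = SOM(4, 4, 8, rng)
where_som = SOM(4, 4, 2, rng)
phon_som = SOM(5, 5, 10, rng)   # plain SOM standing in for the Recursive SOM
for som, data in ((what_som, what_in), (where_som, where_in), (phon_som, phon_in)):
    som.train(data)


def first_layer(i):
    # Concatenate the responses of all first-layer maps for sample i.
    return np.concatenate([
        what_som.activation(what_in[i]),
        where_som.activation(where_in[i]),
        phon_som.activation(phon_in[i]),
    ])


# Second layer: a multimodal map trained on the concatenated responses.
multimodal_in = np.array([first_layer(i) for i in range(n)])
multimodal_som = SOM(6, 6, multimodal_in.shape[1], rng)
multimodal_som.train(multimodal_in)
winner = multimodal_som.winner(multimodal_in[0])
```

Here `winner` is the index of the second-layer unit that responds to the first sample. With real data, one would inspect which second-layer units respond to matching pairs of scenes and sentences; with the random placeholders above, the maps carry no meaningful structure.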
Pages: 101-112
Number of pages: 12