Element similarity in high-dimensional materials representations

Cited by: 2
Authors
Onwuli, Anthony [1]
Hegde, Ashish V. [1]
Nguyen, Kevin V. T. [1]
Butler, Keith T. [2]
Walsh, Aron [1,3]
Affiliations
[1] Imperial Coll London, Dept Mat, London SW7 2AZ, England
[2] UCL, Dept Chem, London WC1H 0AJ, England
[3] Ewha Womans Univ, Dept Phys, Seoul 03760, South Korea
Source
DIGITAL DISCOVERY | 2023 / Volume 2 / Issue 05
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
BINARY COMPOUNDS; RADII;
DOI
10.1039/d3dd00121k
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The traditional display of elements in the periodic table is convenient for the study of chemistry and physics. However, the atomic number alone is insufficient for training statistical machine learning models to describe and extract composition-structure-property relationships. Here, we assess the similarity and correlations contained within high-dimensional local and distributed representations of the chemical elements, as implemented in an open-source Python package ElementEmbeddings. These include element vectors of up to 200 dimensions derived from known physical properties, crystal structure analysis, natural language processing, and deep learning models. A range of distance measures are compared and a clustering of elements into familiar groups is found using dimensionality reduction techniques. The cosine similarity is used to assess the utility of these metrics for crystal structure prediction, showing that they can outperform the traditional radius ratio rules for the structural classification of AB binary solids. Elements can be represented as vectors in a high-dimensional chemical space. We explore the distance and correlation between these vectors for different machine learning models.
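The distance measures named in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and does not call the ElementEmbeddings package, whose API is not described in this record. The three 4-dimensional vectors and the helper names (`TOY_VECTORS`, `cosine_similarity`, `euclidean_distance`, `rank_by_similarity`) are hypothetical and invented for illustration; they are not data or functions from the paper.

```python
import math

# Hypothetical toy element vectors, invented for illustration only.
# The representations assessed in the paper have up to 200 dimensions.
TOY_VECTORS = {
    "Na": [1.0, 0.1, 0.0, 0.2],
    "K": [0.9, 0.2, 0.1, 0.2],
    "Cl": [0.0, 0.1, 1.0, 0.8],
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1 = same direction)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between u and v (0 = identical)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_similarity(symbol, vectors):
    """List the other elements, most cosine-similar to `symbol` first."""
    target = vectors[symbol]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != symbol
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in rank_by_similarity("Na", TOY_VECTORS):
        print(f"Na vs {other}: cosine similarity = {score:.3f}")
```

With these made-up vectors, K ranks above Cl as a neighbour of Na, which mirrors the kind of chemical grouping the abstract describes. Cosine similarity depends only on the direction of the vectors, whereas Euclidean distance is also sensitive to their magnitude, so the two measures can rank neighbours differently on real representations.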
Pages: 1558-1564
Page count: 7