LINGO-DL: a text-based approach for molecular similarity searching

Cited by: 1
Authors
Abdo, Ammar [1 ]
Pupin, Maude [1 ]
Affiliations
[1] Univ Lille, Villeneuve d'Ascq, France
Keywords
Drug discovery; Molecular fingerprints; Ligand-based virtual screening; SMILES; LINGO
DOI
10.1007/s10822-021-00383-9
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification codes
071010; 081704
Abstract
Line notations of chemical structures are more compact than graphs and connection tables, so they are useful for storing and transferring large numbers of molecular structures. The simplified molecular-input line-entry system (SMILES) is the most widely used line notation: it is easier to use and understand than the alternatives, and it can be generated automatically from connection tables. A SMILES string encodes the structure of a molecule. An existing method, LINGO, uses SMILES to calculate molecular similarities and predict structure-related properties. LINGO decomposes a canonical SMILES string into the set of its four-character substrings, referred to as LINGOs, and measures the similarity between a pair of molecules by comparing the LINGOs that occur in each. This paper introduces LINGO-DL, a variant of the LINGO method that uses LINGOs of different lengths: it fragments the canonical SMILES into substrings of three different lengths rather than the single length used in LINGO. Retrospective virtual screening experiments with the MDDR, DUD, and MUV datasets show that LINGO-DL outperforms the LINGO method, especially when the active molecules being sought have a high degree of structural heterogeneity.
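The decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made only for illustration: the function names (`lingos`, `multiset_tanimoto`, `lingo_similarity`) are hypothetical, the inputs are assumed to be canonical SMILES strings already, any SMILES preprocessing is omitted, the multiset Tanimoto coefficient stands in for whatever similarity formula the paper uses, and the lengths `(3, 4, 5)` are placeholders because the abstract does not state which three lengths LINGO-DL uses.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count every overlapping substring of length q in a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def multiset_tanimoto(a: Counter, b: Counter) -> float:
    """Shared substring counts divided by combined counts (an assumed metric)."""
    shared = sum((a & b).values())    # element-wise minimum of counts
    combined = sum((a | b).values())  # element-wise maximum of counts
    return shared / combined if combined else 0.0


def lingo_similarity(smiles_a: str, smiles_b: str, lengths=(4,)) -> float:
    """Compare two SMILES strings using substrings of the given lengths.

    lengths=(4,) mimics the single-length LINGO scheme; passing several
    lengths pools the substrings, which is one assumed way to combine them.
    """
    profile_a, profile_b = Counter(), Counter()
    for q in lengths:
        profile_a += lingos(smiles_a, q)
        profile_b += lingos(smiles_b, q)
    return multiset_tanimoto(profile_a, profile_b)


if __name__ == "__main__":
    phenol, toluene = "c1ccccc1O", "Cc1ccccc1"
    print(lingo_similarity(phenol, toluene))                     # single length
    print(lingo_similarity(phenol, toluene, lengths=(3, 4, 5)))  # hypothetical lengths
```

Pooling the substrings of all lengths into one profile is only one possible design; scoring each length separately and combining the scores would be another, and the abstract does not say which approach LINGO-DL takes.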
Pages: 657-665
Number of pages: 9