LINGO-DL: a text-based approach for molecular similarity searching

Cited by: 1
Authors
Abdo, Ammar [1]
Pupin, Maude [1]
Affiliations
[1] Univ Lille, Villeneuve d'Ascq, France
Keywords
Drug discovery; Molecular fingerprints; Ligand-based virtual screening; SMILES; LINGO;
DOI
10.1007/s10822-021-00383-9
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Line notations of chemical structures are more compact than graphs and connection tables, which makes them useful for storing and transferring large numbers of molecular structures. The simplified molecular-input line-entry system (SMILES) is the most widely used line notation: it is easier to use and understand than the alternatives, and it can be generated automatically from connection tables. A SMILES string encodes the structure of a molecule. An existing method, LINGO, uses SMILES to calculate molecular similarities and to predict structure-related properties. LINGO decomposes a canonical SMILES into a set of four-character substrings, referred to as LINGOs, and measures the similarity between a pair of molecules by comparing the LINGOs that occur in each. This paper introduces LINGO-DL, an alternative version of the method that uses LINGOs of different lengths: it fragments a canonical SMILES into substrings of three different lengths rather than the single length used by LINGO. Retrospective virtual screening experiments with the MDDR, DUD, and MUV datasets show that LINGO-DL outperforms LINGO, especially when the active molecules being sought have a high degree of structural heterogeneity.
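The substring idea described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: substrings are taken as overlapping windows, similarity is scored with a multiset Tanimoto coefficient as a stand-in for whatever formula the paper uses, the three lengths `(3, 4, 5)` are hypothetical placeholders, and the substrings of all lengths are simply pooled. The function names are illustrative only.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count every window of q consecutive characters in a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def tanimoto(a: Counter, b: Counter) -> float:
    """Multiset Tanimoto: shared counts divided by counts in the union."""
    shared = sum((a & b).values())   # element-wise minimum of counts
    union = sum((a | b).values())    # element-wise maximum of counts
    return shared / union if union else 0.0


def lingo_similarity(s1: str, s2: str, q: int = 4) -> float:
    """Single-length comparison, in the spirit of the LINGO method."""
    return tanimoto(lingos(s1, q), lingos(s2, q))


def multi_length_similarity(s1: str, s2: str, lengths=(3, 4, 5)) -> float:
    """Pool substrings of several lengths before comparing.

    The lengths (3, 4, 5) and the pooling strategy are assumptions made
    for illustration; they are not taken from the paper.
    """
    a, b = Counter(), Counter()
    for q in lengths:
        a.update(lingos(s1, q))
        b.update(lingos(s2, q))
    return tanimoto(a, b)
```

For example, `lingo_similarity("CCCCO", "CCCCN")` compares the sets {`CCCC`, `CCCO`} and {`CCCC`, `CCCN`}; one of three distinct substrings is shared. Input strings are assumed to be canonical SMILES produced beforehand by a cheminformatics toolkit, since the comparison is purely textual.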
Pages: 657-665
Page count: 9