Large-Scale Comparison of Alternative Similarity Search Strategies with Varying Chemical Information Contents

Cited by: 4
Authors
Laufkoetter, Oliver [1]
Miyao, Tomoyuki [2,3]
Bajorath, Juergen [1]
Affiliations
[1] Rheinische Friedrich Wilhelms Univ, Dept Life Sci Informat, Chem Biol & Med Chem, LIMES Program Unit, B IT, Endenicher Allee 19c, D-53115 Bonn, Germany
[2] Nara Inst Sci & Technol, Data Sci Ctr, 8916-5 Takayama Cho, Ikoma, Nara 6300192, Japan
[3] Nara Inst Sci & Technol, Grad Sch Sci & Technol, 8916-5 Takayama Cho, Ikoma, Nara 6300192, Japan
Source
ACS OMEGA | 2019 / Vol. 4 / Issue 12
Keywords
FINGERPRINTS; INCREASES; FEATURES;
DOI
10.1021/acsomega.9b02470
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Similarity searching (SS) is a core approach in computational compound screening and has a long tradition in pharmaceutical research. Over the years, different approaches have been introduced to increase the information content of search calculations and optimize the ability to detect compounds having similar activity. We present a large-scale comparison of distinct search strategies on more than 600 qualifying compound activity classes. Challenging test cases for SS were identified and used to evaluate different ways to further improve search performance, which provided a differentiated view of alternative search strategies and their relative performance. It was found that search results could not only be improved by increasing compound input information but also by focusing similarity calculations on database compounds. In the presence of multiple active reference compounds, asymmetric SS with high weights on chemical features of target compounds emerged as an overall preferred approach across many different activity classes. These findings have implications for practical virtual screening applications.
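As a rough illustration of the asymmetric search idea mentioned in the abstract, the sketch below uses the Tversky coefficient, a common asymmetric similarity measure; the abstract itself does not name the measure, so this choice is an assumption. Fingerprints are modeled as sets of feature indices, "target compounds" is read as the database compounds being screened, and multiple reference compounds are combined by taking the maximum score. The function names, the weights, and the max-fusion rule are all hypothetical choices for illustration, not the authors' protocol.

```python
from typing import Iterable, Set


def tversky(ref: Set[int], db: Set[int], alpha: float, beta: float) -> float:
    """Tversky similarity between a reference and a database fingerprint.

    alpha weights features found only in the reference compound,
    beta weights features found only in the database compound.
    alpha == beta == 1 gives the symmetric Tanimoto coefficient.
    """
    common = len(ref & db)
    only_ref = len(ref - db)
    only_db = len(db - ref)
    denom = common + alpha * only_ref + beta * only_db
    return common / denom if denom else 0.0


def max_fusion_score(refs: Iterable[Set[int]], db: Set[int],
                     alpha: float, beta: float) -> float:
    """Score a database compound against several references (assumed max fusion)."""
    return max(tversky(r, db, alpha, beta) for r in refs)


# Toy fingerprints (illustrative feature indices, not real data).
references = [{1, 2, 3, 4}, {2, 3, 5}]
database_compound = {2, 3}

symmetric = tversky(references[0], database_compound, 1.0, 1.0)
asymmetric = tversky(references[0], database_compound, 0.1, 0.9)
fused = max_fusion_score(references, database_compound, 0.1, 0.9)
```

With a high weight on the database compound's features (beta) and a low weight on the reference's (alpha), a small database compound whose features are all contained in a larger reference scores higher than it would under the symmetric setting.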
Pages: 15304 - 15311
Page count: 8