Controlling coverage of D-optimal onion designs and selections

Cited by: 15
Authors
Olsson, IM [1]
Gottfries, J
Wold, S
Affiliations
[1] Umea Univ, Chemometr Res Grp, Dept Chem, SE-90187 Umea, Sweden
[2] AstraZeneca R&D, Med Chem, SE-43183 Molndal, Sweden
Keywords
statistical molecular design; space-filling design; D-optimal design; D-optimal onion designs; principal properties; PLS
DOI
10.1002/cem.901
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Statistical molecular design (SMD) is a powerful approach for selecting compound sets in medicinal chemistry, quantitative structure-activity relationship (QSAR) modelling and other areas. Two techniques often used in SMD are space-filling and D-optimal designs. Both occasionally lead to unwanted redundancy and replication. To remedy these shortcomings, a generalization of D-optimal selection was recently developed. The method divides the candidate set of compounds into a number of subsets ('layers' or 'shells') and makes a D-optimal selection from each layer. This makes it easier to select representative molecular structures throughout any property space, independently of the requested sample size, which is important in complex situations where any given model is unlikely to be valid over the whole investigated domain of experimental conditions. The number of selected molecules can be controlled by varying the number of subsets, by altering the complexity of the model equation in each layer, and/or by altering the dependency on previous layers. The method, called D-optimal onion design (DOOD), allows the user to choose the model equation complexity independently of sample size while still avoiding unwarranted redundancy. The focus of the present work is algorithmic improvements of DOOD in comparison with classical D-optimal design. As illustrations, extended DOODs have been generated for two applications by in-house programming, including some modifications of the D-optimal algorithm. The performance of the investigated approaches is expected to differ depending on the number of principal properties of the compounds in the design, the sample size and the investigated model, i.e. the aim of the design. QSAR models have been generated from the selected compound sets, and root mean squared error of prediction (RMSEP) values have been used to measure the performance of the different designs. Copyright © 2005 John Wiley & Sons, Ltd.
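The layered selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' in-house implementation. It assumes the candidate compounds are described by a matrix of principal-property scores, that layers are equal-count shells by Euclidean distance from the centroid, that each layer uses a linear model with an intercept, and that a simple greedy forward search stands in for the D-optimal exchange algorithm. The names `greedy_d_optimal` and `onion_design` and the parameters `n_layers`, `n_per_layer` and `ridge` are hypothetical.

```python
import numpy as np


def greedy_d_optimal(X, n_select, ridge=1e-6):
    """Greedily pick rows of the model matrix X to maximize det(X_s^T X_s).

    Uses the identity det(M + x x^T) = det(M) * (1 + x^T M^-1 x), so the
    row with the largest x^T M^-1 x gives the largest determinant gain.
    The small ridge term (an assumption) keeps M invertible at the start.
    """
    n, p = X.shape
    M = ridge * np.eye(p)
    available = np.ones(n, dtype=bool)
    chosen = []
    for _ in range(min(n_select, n)):
        M_inv = np.linalg.inv(M)
        gains = np.einsum("ij,jk,ik->i", X, M_inv, X)
        gains = np.where(available, gains, -np.inf)  # never re-pick a row
        best = int(np.argmax(gains))
        chosen.append(best)
        available[best] = False
        M += np.outer(X[best], X[best])
    return chosen


def onion_design(scores, n_layers=3, n_per_layer=4):
    """Split candidates into distance shells and select from each shell."""
    centered = scores - scores.mean(axis=0)
    dist = np.linalg.norm(centered, axis=1)
    # Equal-count shells: layer boundaries are quantiles of the distance.
    edges = np.quantile(dist, np.linspace(0.0, 1.0, n_layers + 1))
    layer = np.clip(np.searchsorted(edges, dist, side="right") - 1,
                    0, n_layers - 1)
    selected = []
    for k in range(n_layers):
        idx = np.flatnonzero(layer == k)
        # Linear model with intercept, fitted independently in each layer.
        X = np.column_stack([np.ones(len(idx)), centered[idx]])
        picks = greedy_d_optimal(X, n_per_layer)
        selected.extend(idx[picks].tolist())
    return selected, layer


# Example with synthetic scores standing in for principal properties.
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 3))
selected, layer = onion_design(scores, n_layers=3, n_per_layer=4)
print(sorted(selected))
```

In this sketch the total sample size is `n_layers * n_per_layer`, which mirrors the abstract's point that the number of selected compounds can be tuned through the number of layers and the per-layer model. The dependency on previous layers mentioned in the abstract is not modelled here.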
Pages: 548-557
Number of pages: 10