Statistical Molecular Design, part 3
Written by Lennart Eriksson, Per M Andersson, Erik Johansson and Torbjörn Lundstedt
Designing a QSAR training set according to the principles of SMD makes it possible to explore, in a systematic way, the structure-activity relationships within the data set in question. By interrogating the resulting QSAR model, it is then possible to extract clues about how to modify the chemical properties of the compounds in ways that may enhance their biological activity profile.
In particular, it is of interest to compute predictions of the biological properties of compounds that have not yet been synthesized. This process is sometimes known as virtual screening. An important benefit of QSAR-based virtual screening is that the QSAR model itself constitutes a navigation tool. One need not make predictions for all possible combinatorial options in a molecular structure. Rather, the QSAR model can be used to direct the virtual screening towards changing only those substituents and moieties that the model finds important. To illustrate the concept of QSAR-directed virtual screening, we will review a data set of hexapeptides for which the training set was designed according to SMD. Before relating this story, though, we need to review the basic principles underpinning peptide QSAR, and particularly the concept of the "z-scales".
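To give a feeling for why this direction matters, the following minimal Python sketch compares the size of a full hexapeptide library with the number of candidates obtained when only a few positions are varied. It is an illustration under stated assumptions, not the procedure used in the study: the lead sequence `AAAAAA` and the choice of positions 2 and 5 as "important" are hypothetical placeholders, and the function names are introduced here only for this example.

```python
from itertools import product

# One-letter codes of the 20 coded amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def full_library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct peptides of the given length over the alphabet."""
    return len(alphabet) ** length


def directed_candidates(lead, positions, alphabet=AMINO_ACIDS):
    """Yield peptides that differ from `lead` only at the given 0-based positions."""
    for combo in product(alphabet, repeat=len(positions)):
        residues = list(lead)
        for pos, amino_acid in zip(positions, combo):
            residues[pos] = amino_acid
        yield "".join(residues)


# Hypothetical lead hexapeptide; not taken from the data set discussed here.
lead = "AAAAAA"
# Hypothetical choice: assume the QSAR model flags positions 2 and 5
# (0-based indices 1 and 4) as the ones that matter for activity.
important_positions = [1, 4]

candidates = list(directed_candidates(lead, important_positions))

print(full_library_size(len(lead)))  # 64000000 possible hexapeptides
print(len(candidates))               # 400 candidates when two positions vary
```

In a real application, each candidate would then be described by its z-scales and passed to the fitted QSAR model for prediction; that step is omitted here because it depends on the model and descriptor values.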
The two preceding tutorials here at Chemometrics.se (see previous tutorials) have concerned the use of statistical molecular design (SMD) in the design of sets of representative, informative and diverse molecules. SMD is an efficient tool for accomplishing a lead-centered design in drug discovery and design. In so doing, the SMD protocol is actually used to develop a new series of molecules. This is in sharp contrast to the situation often prevailing in, e.g., environmental chemistry and toxicology, where QSAR techniques are used to select subsets of representative compounds.