Six new full-atom metrics were generated for the template-based modeling portion of CASP8. The following are descriptions of what we attempted to measure.

MolProbity Score (MPscore)

The first two metrics, MolProbity score and mainchain reality score, are based only on properties of the predicted model. Previous work on all-atom contact analysis demonstrated that protein structures are exquisitely well packed, with interdigitating favorable van der Waals contacts and minimal overlaps between atoms not involved in hydrogen bonds(1). Unfavorable steric clashes are strongly correlated with poor data quality, with clashes reduced nearly to zero in the well-ordered parts of very high-resolution crystal structures(2). From this analysis (originally intended to improve protein core redesign, but since applied also to improving experimental structures) came the clashscore, reported by the program Probe(1); lower numbers indicate better models.

In addition, the details of protein conformation are remarkably relaxed, such as staggered χ angles(1) and even staggered methyls(3). Forces applied to a given local motif in the crowded environment of a folded protein interior can result in a locally strained conformation, but evolution seems to keep significant strain near the minimum needed for function, presumably because protein stability is too marginal to tolerate more. In updates of traditional validation measures, we have compiled statistics from rigorously quality-filtered crystal structures (by resolution, homology, and overall validation scores at the file level, and by B-factor and sometimes by all-atom steric clashes at the residue level). After appropriate smoothing, the resulting multi-dimensional distributions are used to score how "protein-like" each local conformation is relative to known structures, either for sidechain rotamers(3) or for backbone Ramachandran values(4). Rotamer outliers asymptote to < 1% at high resolution, general-case Ramachandran outliers to < 0.05%, and Ramachandran favored to 98% (Fig. 2 of assessment paper).

All-atom contact, rotamer, and Ramachandran criteria are central to the MolProbity structure-validation web site(5), which has become an accepted standard in macromolecular crystallography: MolProbity hosted over 78,000 serious work sessions in the past year (2008). To satisfy a general demand for a single composite metric for model quality, the MolProbity score (MPscore) was defined as:

MPscore = 0.426*ln(1 + clashscore) + 0.33*ln(1 + max(0, rota_out - 1)) + 0.25*ln(1 + max(0, rama_iffy - 2)) + 0.5

where clashscore is defined as the number of unfavorable all-atom steric overlaps ≥ 0.4Å per 1000 atoms(1); rota_out is the percentage of sidechain conformations classed as rotamer outliers, from those sidechains that can be evaluated; and rama_iffy is the percentage of backbone Ramachandran conformations outside the favored region, from those residues that can be evaluated. The coefficients were derived from a log-linear fit to crystallographic resolution on a filtered set of PDB structures, so that a model's MPscore is the resolution at which its individual scores would be the expected values. Thus, lower MPscores are better.
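As a minimal sketch of the arithmetic only, the composite can be computed as follows, reading the offsets in the formula as rota_out - 1 and rama_iffy - 2. The function and argument names are illustrative assumptions, not part of MolProbity; the three inputs are assumed to have been obtained already from a MolProbity analysis.

```python
import math


def mp_score(clashscore, rota_out_pct, rama_iffy_pct):
    """Combine the three component values into an MPscore (lower is better).

    clashscore    -- steric overlaps >= 0.4 A per 1000 atoms
    rota_out_pct  -- percentage of evaluable sidechains that are rotamer outliers
    rama_iffy_pct -- percentage of evaluable residues outside the Ramachandran
                     favored region
    (Argument names are hypothetical; the coefficients are those in the text.)
    """
    return (
        0.426 * math.log(1.0 + clashscore)
        + 0.33 * math.log(1.0 + max(0.0, rota_out_pct - 1.0))
        + 0.25 * math.log(1.0 + max(0.0, rama_iffy_pct - 2.0))
        + 0.5
    )
```

With no clashes, at most 1% rotamer outliers, and at most 2% of residues outside the favored region, every logarithmic term vanishes and the sketch returns the constant offset 0.5.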

CASP8 marks the first use of the MolProbity score for evaluation of non-experimentally-based structural models. It is a very sensitive and demanding metric, a fact also evident for low-resolution crystal structures or for NMR ensembles. It must be paired with a constraint on compactness, provided by the electron density in crystallographic use and approximately by the GDT score in CASP evaluation. Crystal contacts occasionally alter local conformation, but are too weak to sustain unfavorable strain. Those changes are much smaller than at multimer or ligand interfaces. For CASP8 targets, potential problems between chains or at crystal contacts were addressed as part of defining the assessment units(6).

Mainchain Reality Score (MCRS)

To complement the MolProbity score, it seems desirable to have a model evaluation that (1) only uses backbone atoms in its analysis and (2) takes account of excessive deviations of bond lengths and bond angles from their chemically expected ideal values. For those purposes, the mainchain reality score (MCRS) was developed, defined as follows:

MCRS = 100 – 10*spike – 5*rama_out – 2.5*length_out – 2.5*angle_out

where spike is the per-residue average of the sum of "spike" lengths from Probe (indicating the severity of steric clashes) between pairs of mainchain atoms, rama_out is the percentage of backbone Ramachandran conformations classed as outliers (as opposed to favored or allowed), and length_out and angle_out are the percentages of mainchain bond lengths and bond angles respectively that are outliers > 4σ from ideal(7). The perfect MCRS is 100 (achieved fairly often by predicted models), and any non-idealities are subtracted to yield less desirable scores. The coefficients were set manually to achieve a range of approximately 0-100 for each of the four terms, so that egregious errors in just one of these categories can "make or break" the score. To counter this and achieve a reasonable overall distribution, we truncated the overall MCRS at 0 (necessary for about 14% of all models); note that 0 is already such a bad MCRS that truncation isn’t unduly forgiving of the model. However, we did not discover any models as charmingly dreadful as in CASP6 TBM figure 1(8).
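The subtraction and truncation described above can be sketched as follows. The function and argument names are illustrative assumptions; the four inputs are assumed to have been computed beforehand from Probe output and from mainchain geometry.

```python
def mcrs(spike, rama_out_pct, length_out_pct, angle_out_pct):
    """Mainchain reality score, truncated at 0 (100 is a perfect score).

    spike          -- per-residue average of summed mainchain-mainchain spike lengths
    rama_out_pct   -- percentage of Ramachandran outliers
    length_out_pct -- percentage of mainchain bond lengths > 4 sigma from ideal
    angle_out_pct  -- percentage of mainchain bond angles > 4 sigma from ideal
    (Argument names are hypothetical; the coefficients are those in the text.)
    """
    raw = (
        100.0
        - 10.0 * spike
        - 5.0 * rama_out_pct
        - 2.5 * length_out_pct
        - 2.5 * angle_out_pct
    )
    # Any single term can exceed 100, so the total is floored at 0.
    return max(0.0, raw)
```

For example, inputs of 1, 2, 4, and 4 subtract 10 points from each term and give a score of 60.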

Hydrogen Bond Correctness (HBmc and HBsc)

The last 4 of these 6 new full-model metrics are based on comparisons between the predicted model and the target structure. Knowing the importance of H bonds in determining the specificity of protein folds(9), the CASP7 TBM assessors examined H bond correctness relative to the target(10). We have followed their lead but separated categories for mainchain (HBmc: mainchain-mainchain only) and sidechain (HBsc: sidechain-mainchain and sidechain-sidechain), using Probe(1) to identify the H-bonds.

Briefly, the approach was to calculate the atom pairs involved in H bonds for the target, do the same for the model, and then score the percentage of H bond pairs in the target correctly recapitulated in the model. Probe defines hydrogen bonding rather strictly, as donor-acceptor pairs closer than van der Waals contact. That definition was used for all target H bonds and for mainchain H bonds in the models, which often reached close to 100% match (see Results). However, it is more difficult to predict sidechain H bonds, since they require accurately modeling both backbone and sidechains. Therefore, for HBsc model (but not target) H bonds, we also counted donor-acceptor pairs ≤ 0.5Å beyond van der Waals contact; this raised the scores for otherwise good models from the 20-40% range to the 30-80% range. This extended H bond tolerance was readily accomplished using Probe atom selections of "donor, sc" and "acceptor, sc" with the normal 0.5Å-diameter probe, thus identifying these slightly more distant pairs as well as the usual H bond atom pairs. Note that both HBmc and HBsc measure the match of model to target, since we (like the CASP7 assessors) explicitly required that a model H-bond be between the same pair of named atoms as in the target H-bond.
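The matching step can be sketched as a set comparison. This is an illustration under stated assumptions, not the assessors' actual script: each H bond is assumed to have been extracted already from Probe output as a pair of atom identifiers (here a hypothetical tuple of residue number and atom name), and the names hbond and hb_correctness are invented for the example.

```python
def hbond(atom_a, atom_b):
    """Order-independent key for one H bond between two named atoms."""
    return frozenset((atom_a, atom_b))


def hb_correctness(target_hbonds, model_hbonds):
    """Percentage of target H-bond atom pairs that also occur in the model.

    Both arguments are iterables of hbond() keys. Returns None when the
    target has no H bonds in the category being scored.
    """
    target = set(target_hbonds)
    if not target:
        return None
    recovered = target & set(model_hbonds)
    return 100.0 * len(recovered) / len(target)


# Example: two target H bonds, one of which the model reproduces.
target = [hbond((10, "N"), (6, "O")), hbond((11, "N"), (7, "O"))]
model = [hbond((6, "O"), (10, "N")), hbond((12, "N"), (8, "O"))]
score = hb_correctness(target, model)
```

The same function would serve for HBmc and HBsc; the two differ only in which atom pairs are supplied, with the HBsc model list also containing the slightly more distant pairs admitted by the extended tolerance.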

CASP7 excluded surface H-bonds, but we did not. We believe the best strategy would be in between those two extremes, where sidechain H-bonds would be excluded if they were in regions of uncertain conformation in the target. However, surface H-bonds are generally under- rather than over-represented in crystal structures (perhaps because of high ionic strength in many crystallization media), so prediction of those recognizable in the target should be feasible.

Rotamer Correctness (corRot)

For sidechain rotamers, MolProbity works from smoothed, contoured, multi-dimensional distributions of the high-quality χ-angle data(3)(5); the score value at each point is the percentage of good data that lies outside that contour level. For each individual sidechain conformation, MolProbity looks up the percentile score for its χ-angle values; if that score is ≥ 1%, it assigns the name of the local rotamer peak and if < 1%, it declares an outlier. Rotamer names use a letter for each χ angle (t = trans, m = near -60°, p = near +60°), or an approximate number for final χ angles that significantly differ from one of those 3 values. Using this mechanism, we can define rotamer correctness (corRot) as the match of valid rotamer names between model and target. Note that any model sidechain not in a defined rotamer (i.e., an outlier) is considered non-matching, unless the corresponding target rotamer is also undefined, in which case that residue is simply ignored for corRot. The sidechain rotamers used in SCWRL(11) are quite similar to the MolProbity rotamers, since both are based on recent high-resolution data, quality-filtered at the residue level.
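The matching rule can be sketched as follows. The data layout is an assumption made for illustration: each structure is a dictionary from a residue identifier to its rotamer name, with None standing for an outlier or undefined rotamer. Reporting the result as a percentage of scored target residues, and counting a residue missing from the model as non-matching, are likewise assumptions of this sketch.

```python
def cor_rot(target_rotamers, model_rotamers):
    """Percentage of target residues whose rotamer name is matched by the model.

    Residues whose target rotamer is undefined (None) are ignored.
    A model outlier (None) or a missing model residue never matches.
    Returns None if no target residue can be scored.
    """
    scored = 0
    matched = 0
    for res_id, target_name in target_rotamers.items():
        if target_name is None:
            continue  # target undefined: residue is simply ignored
        scored += 1
        if model_rotamers.get(res_id) == target_name:
            matched += 1
    if scored == 0:
        return None
    return 100.0 * matched / scored


# Example: residue 3 is ignored; only residue 1 matches out of 3 scored.
target = {1: "mt", 2: "tttt", 3: None, 4: "p"}
model = {1: "mt", 2: "tttm", 3: "m", 4: None}
score = cor_rot(target, model)
```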

For X-ray targets, the target rotamer set consists of all residues for which a valid rotamer name could be assigned (i.e. not < 1% rotamer score and not undefined because of missing atoms). For NMR targets, we defined the target rotamer set to include only those residues for which one named rotamer comprised a specified percentage (85, 70, 55, and 40% for sidechains with one, two, three, and four χ angles respectively) of the ensemble. We also considered requiring a sufficient number of nuclear Overhauser effect (NOE) restraints for a residue for it to be included, but concluded that in practice this would be largely redundant with the simpler consensus criterion.
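The NMR consensus criterion can be sketched as below. The thresholds are those stated above; the function name, the use of None for an unnamed (outlier) conformation, the choice to keep such models in the denominator, and the reading of "comprised a specified percentage" as "at least that percentage" are all assumptions of this sketch.

```python
from collections import Counter

# Minimum percentage of ensemble models sharing one rotamer name,
# keyed by the number of chi angles in the sidechain.
CONSENSUS_THRESHOLD = {1: 85.0, 2: 70.0, 3: 55.0, 4: 40.0}


def consensus_rotamer(ensemble_names, n_chi):
    """Return the consensus rotamer name for one residue, or None.

    ensemble_names -- rotamer name of this residue in each NMR model
                      (None where no valid name could be assigned)
    n_chi          -- number of chi angles for this sidechain type (1 to 4)
    """
    named = [name for name in ensemble_names if name is not None]
    if not named:
        return None
    name, count = Counter(named).most_common(1)[0]
    percent = 100.0 * count / len(ensemble_names)
    if percent >= CONSENSUS_THRESHOLD[n_chi]:
        return name
    return None
```

A residue for which this returns None would be left out of the target rotamer set, in the same way as an undefined rotamer in an X-ray target.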

Since incorrect 180° flips of Asn/Gln/His sidechains are caused by a systematic error in interpreting electron density maps, there is no reason for them to be wrong by 180° in predicted models, which could thus sometimes improve locally on the deposited target structure. However, we found that applying automatic correction of Asn/Gln/His flips in targets by MolProbity's standard function yielded only 1% or less improvement in any group-average corRot score. We therefore chose not to apply target flips for the final scoring.

Using rotamer names based on multidimensional distributions rather than simple agreement of individual χ1, or χ1 and χ2, values(12)(5)(13) has the advantage of favoring predictions in real local-minimum conformations and with good placement of the functional sidechain ends. However, a disadvantage is that matching is all-or-none; for example, model rotamers tttm and mmmm would be equally “wrong” matches to a target rotamer tttt in our formulation, meaning the corRot score is more stringent for long sidechains. An improved weighting system might be devised for future use.

Sidechain Positioning (GDC-sc)

In order to apply superposition-based scoring to the functional ends of protein sidechains, we developed a GDT-like score called GDC-sc (global distance calculation for sidechains), using a modification of the LGA program(14). Instead of comparing residue positions on the basis of Cαs, GDC-sc uses a characteristic atom near the end of each sidechain type for the evaluation of residue-residue distance deviations. The list of 18 atoms is given by the gdc_at flag in the LGA command shown below, where each one-letter amino-acid code is followed by the PDB-format atom name to be used:

-3 -ie -o1 -sda -d:4 -swap -gdc:10 -gdc_at:V.CG1,L.CD1,I.CD1,P.CG,M.CE,F.CZ,W.CH2,S.OG,T.OG1,C.SG,Y.OH,N.OD1,Q.OE1,D.OD2,E.OE2,K.NZ,R.NH2,H.NE2

or, alternatively with a new flag, just:

-3 -ie -o1 -sda -d:4 -gdc_sc

Gly and Ala are not included, since their positions are directly determined by the backbone. The -swap flag takes care of the possible ambiguity in Asp or Glu terminal oxygen naming.
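For readers who want the atom list in tabular form, the -gdc_at argument can be read as a lookup table from one-letter residue code to atom name. The snippet below only parses the string quoted above; it does not call LGA, and the names GDC_AT and parse_gdc_at are illustrative.

```python
# The -gdc_at value from the LGA command, written without embedded spaces.
GDC_AT = (
    "V.CG1,L.CD1,I.CD1,P.CG,M.CE,F.CZ,W.CH2,S.OG,T.OG1,C.SG,Y.OH,"
    "N.OD1,Q.OE1,D.OD2,E.OE2,K.NZ,R.NH2,H.NE2"
)


def parse_gdc_at(spec):
    """Map each one-letter amino-acid code to its characteristic atom name."""
    table = {}
    for item in spec.split(","):
        residue, atom = item.strip().split(".")
        table[residue] = atom
    return table


reference_atoms = parse_gdc_at(GDC_AT)
```

The resulting table has 18 entries, with no entry for Gly (G) or Ala (A).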

The traditional GDT-TS score is a weighted sum of the fraction of residues superimposed within limits of 1, 2, 4, and 8Å. For GDC-sc, the LGA backbone superposition is used to calculate fractions of corresponding model-target sidechain atom pairs that fit under 10 distance-limit values from 0.5Å to 5Å, since 8Å would be a displacement too large to be meaningful for a local sidechain difference. The procedure assigns each reference atom to the relevant bin for its model vs. target distance: < 0.5Å, < 1.0Å, ... < 4.5Å, < 5.0Å; for each bin_i, the fraction (Pa_i) of assigned atoms is calculated. Finally the fractions are added and scaled to give a GDC-sc value between 0 and 100, by the formula:

GDC-sc = 100*2*(k*Pa_1 + (k-1)*Pa_2 + ... + 1*Pa_k) / ((k+1)*k), where k=10.
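The scoring formula can be sketched as follows, given the model-to-target distances of the reference atoms after the LGA backbone superposition (which is not reproduced here). One assumption is made explicit: Pa_i is taken as the cumulative fraction of atoms closer than the i-th cutoff, since that reading is what lets a perfect model reach 100 with the denominator (k+1)*k. The function name and the step parameter are illustrative.

```python
def gdc_sc(distances, k=10, step=0.5):
    """GDC-sc from per-residue reference-atom distances (in Angstroms).

    distances -- model-vs-target distance for each scored sidechain atom
    k         -- number of distance cutoffs (10 in the text)
    step      -- spacing of the cutoffs, giving 0.5, 1.0, ..., 5.0 A
    """
    if not distances:
        raise ValueError("at least one distance is required")
    n = len(distances)
    weighted = 0.0
    for i in range(1, k + 1):
        cutoff = i * step
        # Cumulative fraction of atoms under this cutoff (assumed meaning of Pa_i).
        pa_i = sum(1 for d in distances if d < cutoff) / n
        # The tightest cutoff gets weight k, the loosest gets weight 1.
        weighted += (k - i + 1) * pa_i
    return 100.0 * 2.0 * weighted / ((k + 1) * k)
```

Under this reading, a model with every reference atom within 0.5Å scores 100, one with every atom beyond 5Å scores 0, and the heaviest weight falls on the tightest distance limits.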

The goal was a measure sensitive to correct placement of sidechain functional or terminal groups relative to the entire domain, both in the core and forming the surface that makes interactions. The three sidechain measures (HBsc, corRot, and GDC-sc) are meaningful evaluations only for models with an approximately correct overall backbone fold, and so we make use of them only for models with above-average GDT scores.


(1) Word JM, Lovell SC, LaBean TH, Taylor HC, Zalis ME, Presley BK, Richardson JS, Richardson DC. Visualizing and quantifying molecular goodness-of-fit: small-probe contact dots with explicit hydrogen atoms. J Mol Biol 1999;285(4):1711-33.

(2) Arendall WB, 3rd, Tempel W, Richardson JS, Zhou W, Wang S, Davis IW, Liu ZJ, Rose JP, Carson WM, Luo M and others. A test of enhancing model accuracy in high-throughput crystallography. J Struct Funct Genomics 2005;6(1):1-11.

(3) Lovell SC, Word JM, Richardson JS, Richardson DC. The penultimate rotamer library. Proteins 2000;40(3):389-408.

(4) Lovell SC, Davis IW, Arendall WB, 3rd, de Bakker PI, Word JM, Prisant MG, Richardson JS, Richardson DC. Structure validation by Calpha geometry: phi,psi and Cbeta deviation. Proteins 2003;50(3):437-50.

(5) Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, Wang X, Murray LW, Arendall WB, 3rd, Snoeyink J, Richardson JS and others. MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res 2007;35(Web Server issue):W375-83.

(6) Tress ML, Ezkurdia I, Richardson JS. ?? Domain definition and target classification for CASP8 ?? Proteins 2009;??(Suppl 9):??-??

(7) Engh RA, Huber R. Accurate bond and angle parameters for X-ray protein structure refinement. Acta Cryst A 1991;47:392-400.

(8) Tress M, Ezkurdia I, Grana O, Lopez G, Valencia A. Assessment of predictions submitted for the CASP6 comparative modeling category. Proteins 2005;61 Suppl 7:27-45.

(9) Dill KA, Bromberg S. Molecular Driving Forces: Statistical Thermodynamics in Chemistry and Biology. New York: Garland Science; 2002. 704 pp.

(10) Kopp J, Bordoli L, Battey JN, Kiefer F, Schwede T. Assessment of CASP7 predictions for template-based modeling targets. Proteins 2007;69 Suppl 8:38-56.

(11) Canutescu AA, Shelenkov AA, Dunbrack RL, Jr. A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 2003;12(9):2001-14.

(12) Tramontano A, Morea V. Assessment of homology-based predictions in CASP5. Proteins 2003;53 Suppl 6:352-68.

(13) Read RJ, Chavali G. Assessment of CASP7 predictions in the high accuracy template-based modeling category. Proteins 2007;69 Suppl 8:27-37.

(14) Zemla A. LGA: A method for finding 3D similarities in protein structures. Nucleic Acids Res 2003;31(13):3370-4.