README.probe Home Page: Richardsons' Laboratory J. Michael Word - 7/2001


Introduction


The program Probe generates "contact dots" at points on the van der Waals surface of atoms which are in close proximity to other atoms [1]. It reads atomic coordinates from Protein Data Bank (PDB) format files and writes color-coded dot lists (with spikes where atoms clash) for inclusion in a kinemage.

Directly based on the "sp" program by Zalis and Richardson (following the work of Connolly), the approach is to place a small probe (typically of radius 0.25 Å) at points along the van der Waals surface of a selected set of atoms and determine if this probe also contacts atoms within a second "target" set. A flexible method for selecting the source and target atoms is available along with command line flags for altering the probe radius and dot density. Although Probe can generate "surface dots" where there are no nearby atoms, its primary use is to analyze atomic packing. For packing analysis and structure validation, Probe can generate contact surfaces within a set of atoms ("SELF dots").
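
The following Python sketch illustrates the geometric test described above; it is not Probe's actual code. It assumes dots are spread over the source atom's van der Waals sphere with a Fibonacci spiral (Probe's own dot placement may differ), uses the probe radius (0.25 Å) and dot density (16 dots per square Å) quoted in this README as defaults, and ignores covalent bonding, dots buried inside neighboring atoms, and scoring. The function names are hypothetical.

```python
import math

def sphere_dots(center, radius, density=16.0):
    """Spread about density * area points over a sphere (Fibonacci spiral).

    Returns (unit_vector, dot_position) pairs. The spiral is an assumption
    made for this sketch, not necessarily how Probe places its dots.
    """
    n = max(1, round(density * 4.0 * math.pi * radius ** 2))
    golden = math.pi * (3.0 - math.sqrt(5.0))
    dots = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n        # z on the unit sphere, in (-1, 1)
        r = math.sqrt(1.0 - z * z)
        phi = i * golden
        unit = (r * math.cos(phi), r * math.sin(phi), z)
        dots.append((unit, tuple(c + radius * u for c, u in zip(center, unit))))
    return dots

def contact_dots(source, targets, probe_radius=0.25, density=16.0):
    """Return the surface dots of `source` whose probe touches a target atom.

    Atoms are (center, vdw_radius) pairs. For each dot, a probe sphere is
    centered just outside the source surface along the dot direction; the dot
    is kept if that probe reaches within (target radius + probe radius) of
    any target center.
    """
    center, radius = source
    kept = []
    for unit, dot in sphere_dots(center, radius, density):
        probe = tuple(c + (radius + probe_radius) * u
                      for c, u in zip(center, unit))
        for t_center, t_radius in targets:
            if math.dist(probe, t_center) <= t_radius + probe_radius:
                kept.append(dot)
                break
    return kept
```

For example, with two atoms of (illustrative) radius 1.7 Å whose centers are 3.6 Å apart, only dots on the side of the source atom facing its neighbor are kept; moving the neighbor 10 Å away leaves no dots at all.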

For meaningful use of Probe in the study of molecular structures, coordinates for all hydrogen atoms must be included in the model. Modeling with "implicit hydrogens" is inadequate since the vast majority of steric interactions which constrain conformational choices take place among hydrogens. A program called Reduce, also available from the Richardson lab, uses simple geometric considerations to add hydrogens to a PDB file and optimize their orientations [2].

Probe has many options which modify the way output is formatted. Instead of kinemage format, it can write graphical information in O or XtalView format. It can calculate a table of dot information with contact score values and percent dot coverage. Finally, it can produce a detailed "unformatted" description of each dot, including source and target atom names, distances, atom types, and partial scores. Because Probe is very flexible, it is helpful to develop a working knowledge of options and especially selection criteria.



New Features


v2.4
(7/11/01)
Buttons are no longer generated for each element type by default. To generate these buttons use the -element flag.

v2.5
(7/25/01)
Selections can now refer to negative residue numbers. (Sometimes you need to include extra parentheses or a space to prevent the selection from being treated as a command line option.)


Running Probe


Probe was designed for UNIX and the commands described below follow the UNIX conventions. For a brief description of Probe features, run Probe without any options. The command "probe -h" will give a more complete description of program options.

In its most basic form, the syntax is:

probe input.pdb >> outputDots.kin

which will generate SELF dots for all atoms in the input file except alternate (e.g., B or C) conformations and append them to the end of the kinemage file. Note the ">>" redirection symbol which stands for APPEND; in the normal case, Prekin would be used to make a kinemage of the molecular structure and Probe would be used to append the dot information.


A more extensive set of command line options is available in the format

probe [-flags] "pattern1" ["pattern2"] input.pdb [more.pdbs...] [>> outfile]

(the parts in square brackets are optional; by default the results go to standard output).

There are four modes set by command line flags (-SELF is assumed if not given):

probe -SELF "pattern1"            inputfiles >> kinfile
   ### Intersect 1->1
probe -BOTH "pattern1" "pattern2" inputfiles >> kinfile
   ### Intersect 1->2 and 2->1
probe -ONCE "pattern1" "pattern2" inputfiles >> kinfile
   ### Intersect 1->2
probe -OUT  "pattern1"            inputfiles >> kinfile
   ### External surface

(How the selected atoms interact is listed above as a comment after the hash mark.)
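
When Probe is driven from a script rather than typed by hand, the mode and pattern rules above can be checked before the program is started. The Python sketch below is one possible wrapper, not part of Probe: the function names are hypothetical, it assumes an executable called "probe" is on the PATH, and it uses only the four mode flags listed above. Because the arguments are passed as a list, patterns containing spaces need no extra shell quoting.

```python
import subprocess

# Number of patterns each mode expects (SELF and OUT take one,
# BOTH and ONCE take two), as stated in Probe's help text.
PATTERNS_PER_MODE = {"-self": 1, "-out": 1, "-both": 2, "-once": 2}

def probe_argv(mode, patterns, pdb_files, extra_flags=()):
    """Build an argument list: probe [flags] mode pattern(s) pdbfile(s)."""
    wanted = PATTERNS_PER_MODE.get(mode.lower())
    if wanted is None:
        raise ValueError("unknown mode flag: %s" % mode)
    if len(patterns) != wanted:
        raise ValueError("%s needs %d pattern(s), got %d"
                         % (mode, wanted, len(patterns)))
    if not pdb_files:
        raise ValueError("at least one PDB file is required")
    return ["probe", *extra_flags, mode, *patterns, *pdb_files]

def append_dots(argv, kin_path):
    """Run Probe and append its standard output to kin_path (like '>>')."""
    with open(kin_path, "ab") as kin:
        subprocess.run(argv, stdout=kin, check=True)
```

A call such as append_dots(probe_argv("-both", ["chainE", "chainI"], ["enzH.pdb"]), "interface.kin") would then correspond to the -both command form shown above.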

By default, HET groups and waters are included in the dot calculations but *NOT* mainchain to mainchain interactions. These settings may be changed with the -NOHET, -NOWATER and -MC flags.

The flag -U is used to dump 'unformatted' dot information which can be sent to other programs or scripts for analysis. The flag -STDBONDS will make Probe consult an internal table when deciding the bonding pattern. This is used in modeling where impossible conformations may be analyzed without the problem of improper bonding patterns being inferred from atomic distances.



Patterns


The use of patterns to specify the interaction being examined is illustrated with the following examples:

probe "altA blt40" 1filH.pdb >> lowBdot.kin

calculates self packing in all atoms from the file 1filH.pdb with a temperature factor less than 40 and an alternate conformation code of blank or "A" (-self is the default and the pattern is in quotes because it contains a space). This is a useful pattern for validating a structure because it ignores atoms which may have poorly determined coordinates. In other situations, the pattern could be replaced with "all" to select all the atoms.
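
To make the meaning of this pattern concrete, the Python sketch below applies the same two tests (alternate conformation code blank or "A", temperature factor below 40) directly to PDB records. It is only an illustration of what the pattern selects, not how Probe implements selection. It assumes complete fixed-column ATOM/HETATM records, with the alternate location code in column 17 and the temperature factor in columns 61-66; the function name is hypothetical.

```python
def select_alta_blt40(pdb_lines, alt="A", b_max=40.0):
    """Yield the ATOM/HETATM records that the pattern "altA blt40" describes."""
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue                        # skip REMARK, TER, etc.
        alt_loc = line[16:17]               # column 17: alternate location code
        b_value = float(line[60:66])        # columns 61-66: temperature factor
        if alt_loc in (" ", alt) and b_value < b_max:
            yield line
```

Both conditions must hold for an atom to be selected, which mirrors the rule that pattern elements separated by blanks must all be true.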

To identify the interface between chain E and chain I in the file enzH.pdb:

probe -both "chainE" "chainI" enzH.pdb >> interface.kin

To create a table of contact statistics, use -count:

probe -count -self "all" mypdbH > dotinfo.table

The example also shows the use of a single '>' mark, the UNIX signal to overwrite (!) rather than append to the output file.

Even more information about each dot can be tabulated with -unformated (the flag is spelled with a single "t"):

probe -unformated -self "all" mypdbH > rawinfo.table

You can also create surface dots:

probe -out all 1filH.pdb >> surfacedots.kin

These dots are equivalent to the non-reentrant part of a Connolly surface. When using surface dots, it is sometimes useful to expand or contract the probe radius using the -rad#.# flag (e.g. -rad1.4 for a water size probe, or -rad0.0 to see a sphere-like representation of residues).

Finally, here is a sequence of Prekin and Probe commands which can create a kinemage where each category of contact is broken down separately. The patterns used give some sense of the level of control Probe permits.

prekin -lots input.pdb outputdot.kin
probe -3 -lens -q -name scsc -self "sc alta blt40 ogt33" input.pdb >> outputdot.kin
probe -3 -lens -q -name scmc -both "sc alta blt40 ogt33" "mc alta blt40 ogt33" input.pdb >> outputdot.kin
probe -3 -lens -q -name mcmc -mc -self "mc alta blt40 ogt33" input.pdb >> outputdot.kin
probe -3 -lens -q -name wathet -both "het,water alta blt40 ogt65,(not water ogt33)" \
"not(het,water) alta blt40 ogt33" input.pdb >> outputdot.kin


Internal program help text


Syntax: probe input.pdb >> out.kin
    or: probe [flags] "src pattern" ["target pattern"] pdbfiles... >> out.kin

Flags:
  -SElf  self intersection:   src  -> src (default)
  -Both  intersect both ways: src <=> targ
  -ONce  single intersection: src  -> targ
  -OUt   external van der Waals surface of src (solvent contact surface)

  -AUTObondrot filename    read and process an autobondrot file

  shortcuts:
  -SCAN0 same as: -3 -mc -self "alta blt40 ogt33"
  -SCAN1 same as: -3 -once "sc alta blt40 ogt33" "alta blt40 ogt65,(not water ogt33)"
  -SCSurface same as: -drop -rad1.4         -out "not water"
  -EXPOsed   same as: -drop -rad1.4         -out      (note: user supplies pattern)
  -ASurface  same as: -drop -rad0.0 -add1.4 -out "not water"
  -ACCESS    same as: -drop -rad0.0 -add1.4 -out      (note: user supplies pattern)

  -DUMPAtominfo   count the atoms in the selection: src

  (note that BOTH and ONCE require two patterns while
   OUT, SELF and DUMPATOMINFO require just one pattern)

  -Implicit    implicit hydrogens
  -Explicit    explicit hydrogens (default)
  -DEnsity#    set dot density (default 16 dots/sq A)
  -Radius#.#   set probe radius (default 0.25 A)
  -ADDvdw#.#   offset added to Van der Waals radii (default 0.0)
  -SCALEvdw#.# scale factor for Van der Waals radii (default 1.0)
  -COSCale#.#  scale C=O carbon Van der Waals radii (default 0.94)
  -SPike       draw spike instead of dots (default)
  -SPike#.#    set spike scale (default=0.5)
  -NOSpike     draw only dots
  -HBRegular#.# max overlap for regular Hbonds(default=0.6)
  -HBCharged#.# max overlap for charged Hbonds(default=0.8)
  -Keep        keep nonselected atoms (default)
  -DRop        drop nonselected atoms
  -LIMit       limit bump dots to max dist when kissing (default)
  -NOLIMit     do not limit bump dots
  -LENs        add lens keyword to kin file
  -NOLENs      do not add lens keyword to kin file (default)
  -MC          include mainchain->mainchain interactions
  -HETs        include dots to non-water HET groups (default)
  -NOHETs      exclude dots to non-water HET groups
  -WATers      include dots to water (default)
  -NOWATers    exclude dots to water
  -WAT2wat     show dots between waters
  -DUMPH2O     include water H? vectorlist in output
  -4H          extend bond chain dot removal to 4 for H (default)
  -3           limit bond chain dot removal to 3
  -2           limit bond chain dot removal to 2
  -1           limit bond chain dot removal to 1
  -IGNORE "pattern" explicit drop: ignore atoms selected by pattern
  -CHO#.#      scale factor for CH..O Hbond score (default=0.5)
  -PolarH      use short radii of polar hydrogens (default)
  -NOPolarH    do not shorten radii of polar hydrogens

  -NOFACEhbond do not identify HBonds to aromatic faces
  -Name "name" specify the group name (default "dots")
  -Countdots   produce a count of dots-not a dotlist
  -Unformated  output raw dot info
  -OFORMAT     output dot info formatted for display in O
  -XVFORMAT    output dot info formatted for display in XtalView
  -GAPcolor    color dots by gap amount (default)
  -ATOMcolor   color dots by atom type
  -BASEcolor   color dots by nucleic acid base type
  -COLORBase   color dots by gap and nucleic acid base type
  -OUTCOLor "name" specify the point color for -OUT (default "gray")
  -GAPWeight#  set weight for scoring gaps (default 0.25)
  -BUMPWeight# set relative scale for scoring bumps (default 10.0)
  -HBWeight#   set relative scale for scoring Hbonds (default 4.0)
  -DIVLow#.#   Division for Bump categories    (default -0.4)
  -DIVHigh#.#  Division for Contact categories (default  0.25)
  -KINemage    add @kinemage 1 statement to top of .kin format output
  -NOGroup     do not generate @group statement in .kin format output
  -ELEMent     add master buttons for different elements in kin output
  -NOHBOUT     do not output contacts for HBonds
  -NOCLASHOUT  do not output contacts for clashes
  -NOVDWOUT    do not output contacts for van der Waals interactions
  -NOTICKs     do not display the residue name ticker during processing
  -STDBONDs    assume only standard bonding patterns in standard residues
  -NOPARENT    do not bond hydrogens based on table of parent heavy atoms

  -SEGID       use the PDB SegID field to discriminate between residues
  -OLDU        generate old style -u output: kissEdge2BullsEye, etc
  -VErbose     verbose mode (default)
  -REFerence   display reference string
  -Quiet       quiet mode

  -Help  show expanded help notice (includes other flags)

Pattern elements:  (should be put in quotes on the command line)
   FILE#     within file #
   MODEL#    within model #
   CHAINa    within chain a
   SEGaaaa   segment identifier aaaa (where _ represents blank)
   ALTa      alternate conformation a
   ATOMaaaa  atom name aaaa (where _ represents blank)
             (all 4 characters are used so H would be ATOM_H__)
   RESaaa    residue aaa
   #         residue #
   #-#       residue range #
   res       residue type by one or three letter codes
   ALL,PROTEIN,MC,SC,BASE,ALPHA,BETA,NITROGEN,CARBON,OXYGEN,
   SULFUR,PHOSPHORUS,HYDROGEN,METAL,POLAR,NONPOLAR,CHARGED,
   DONOR,ACCEPTOR,AROMATIC,METHYL,HET,WATER,DNA,RNA
             all or a subset of the atoms
   OLT#      Occupancy less than # (integer percent)
   OGT#      Occupancy greater than # (integer percent)
   BLT#      B-value less than # (integer)
   BGT#      B-value greater than # (integer)
   
   WITHIN #.# OF #.#, #.#, #.#   atoms within distance from point
   
   Patterns can be combined into comma separated lists
   such as "trp,phe,tyr" meaning TRP or PHE or TYR.
   
   Patterns that are separated by blanks must all be true
   such as "chainb 1-5" meaning residues 1 to 5 in chain B.
   
   You can also group patterns with parentheses, separate
   multiple patterns with | meaning 'or' and choose the
   complement with NOT as in "not file1" meaning not in file 1.

   An autobondrot file is similar to other PDB input files
   but it includes information identifying atoms subject to rotations
   and other transformations.

   Example autobondrot file fragment showing Calpha-Cbeta bond rotation
   and a periodic torsion penalty function for this rotation
     ATOM      1  CB  TYR    61      34.219  17.937   4.659  1.00  0.00
     bondrot:chi1:78.7:  0:359:5:33.138:18.517: 5.531:34.219:17.937: 4.659
     cos:-3:60:3:
     ATOM      1 1HB  TYR    61      34.766  18.777   4.206  1.00  0.00
     ATOM      1 2HB  TYR    61      34.927  17.409   5.315  1.00  0.00
     ATOM      1  CG  TYR    61      33.836  16.989   3.546  1.00  0.00
     ...
   Autobondrot commands use colons to separate values
   Transformations: BONDROT:id:currAng:start:end:stepSz:x1:y1:z1:x2:y2:z2
                    TRANS:  id:currpos:start:end:stepSz:x1:y1:z1:x2:y2:z2
                    NULL  # dummy
   Bias functions:  COS:scale:phaseOffset:frequency
                    POLY:scale:offset:polynomialDegree
                    CONST:value
   Branching:       SAVE and RESTORE or "(" and ")"
           (e.g. to rotate each Chi and the methyls for isoleucine the
            sequence is: rotChi1/SAVE/rotChi2/rotCD1/RESTORE/rotCG2)
   Set orientation: GO:angle1:angle2:...
   Include files:   @filename
   Comments:        # comment text

Probe: version 2.6 10/01/2001, Copyright 1996-2001, J. Michael Word


Autobondrot


Starting with version 2.0, Probe has been extended to read specially marked up fragments of a PDB file which describe dihedral rotations as well as other transformations. The command line flag -autobondrot precedes the filename and causes the file to be interpreted as a script for scanning a range of conformations. A description of the format of these rotation scripts or .rotscr files is in the file README.autobondrot.
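
As an illustration of the colon-separated layout given in the help text (BONDROT:id:currAng:start:end:stepSz:x1:y1:z1:x2:y2:z2), the Python sketch below parses one such command and lists the angles a scan would visit. It is not Probe's parser. The class and function names are hypothetical, and two points are assumptions: that the keyword is matched without regard to case, and that the scan steps from start toward end inclusively. README.autobondrot remains the authority on the format.

```python
from dataclasses import dataclass

@dataclass
class BondRot:
    ident: str          # id field, e.g. "chi1"
    curr_angle: float   # currAng
    start: float
    end: float
    step: float         # stepSz
    axis_from: tuple    # x1, y1, z1
    axis_to: tuple      # x2, y2, z2

    def scan_angles(self):
        """Angles from start up to end in steps of step (inclusive; assumed)."""
        count = int((self.end - self.start) // self.step) + 1
        return [self.start + i * self.step for i in range(count)]

def parse_bondrot(line):
    """Split a BONDROT command on colons and convert the numeric fields."""
    fields = [f.strip() for f in line.strip().split(":")]
    if fields[0].lower() != "bondrot" or len(fields) < 12:
        raise ValueError("not a BONDROT command: %r" % line)
    nums = [float(f) for f in fields[2:12]]
    return BondRot(fields[1], *nums[:4], tuple(nums[4:7]), tuple(nums[7:10]))
```

Applied to the example line from the help text (bondrot:chi1:78.7:  0:359:5:...), this yields the id "chi1" and, under the inclusive-range assumption, 72 angles from 0 to 355 degrees in 5 degree steps. The second axis point in that example has the same coordinates as the CB atom on the preceding line.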



Troubleshooting


If you don't have Probe or if you have an old copy, you can get the latest release from our website; binary executable files are available for several operating systems along with source code. Make sure you download .tar.Z or .tgz files as BINARY. If you download an .exe file, you will probably wish to rename it to just "probe" and put it into a directory which is listed in your PATH environment variable. For UNIX or Linux you will also have to make it executable with the command: chmod +x probe

The source code should compile easily on almost all UNIX-like systems. Copy the "Makefile" for your system (cp Makefile.xxx Makefile), check for any system specific issues (altering as required) and then type: make probe

The most common problem when using Probe is specifying selection patterns incorrectly. Remember that self dots (-self, the default) take one pattern before the filename, while interface dots (-both) take two patterns before the filename.

Output from Probe is generally designed to be appended to the end of a kinemage file. If you just want to see the dots without creating a model first, add @kinemage 1 as the first line of the dot file (either by hand or with the Probe flag -kin) so that Mage can display it.
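
If the dot file has already been written without the -kin flag, the missing first line can also be added afterwards. The short Python sketch below does this on the text of a dot file; the function name is hypothetical, and it assumes the whole file fits comfortably in memory.

```python
def ensure_kinemage_header(text):
    """Return kinemage text that begins with an @kinemage line.

    Text that already starts with @kinemage is returned unchanged, so the
    function can safely be applied more than once.
    """
    if text.lstrip().lower().startswith("@kinemage"):
        return text
    return "@kinemage 1\n" + text
```

Reading the dot file, passing its contents through this function and writing the result back gives a file that starts with @kinemage 1.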



Program Information


We hope this helps you get started looking at molecular contact surfaces. To find the latest version of Probe, see http://kinemage.biochem.duke.edu/. A comprehensive description of the small-probe method is found in [1] and [2].

References

1) Word, et al. (1999) Visualizing and Quantifying Molecular Goodness-of-Fit: Small-probe Contact Dots with Explicit Hydrogens, J. Mol. Biol. 285, 1711-1733.
2) Word, et al. (1999) Asparagine and Glutamine: Using Hydrogen Atom Contacts in the Choice of Side-chain Amide Orientation, J. Mol. Biol. 285, 1735-1747.