Auxiliary Programs

These programs are all written in the OCaml language, with the exception of born_integral, which is written in C++. In addition to the input flags described for each program, every program accepts the flag -help, which makes it print out a description of itself. The outputs of these programs eventually funnel into the simulation code (which is written in C++).

uhbd2dx

Converts a UHBD grid file to an OpenDX file. Reads from standard input and writes to standard output.

pqr2xml

This program converts a PQR file, which describes the collection of charged spheres making up the molecule, to an equivalent XML file (PQRXML file). The reason for this is that most of the files processed by the UCBD software are in XML format. PQR files can be generated from PDB files using software included with APBS. More information on the PQR format, and the equivalent XML format, can be found in the APBS documentation. The pqr2xml program receives the PQR file through standard input, and outputs to standard output. Example:

 cat mol.pqr | pqr2xml > mol.xml

residue_test_charges

This program reads a PQRXML file from standard input describing the charged spheres (output from pqr2xml), and outputs a test charge for each residue that has a total charge significantly greater than zero. Later, I want to use effective charges, but only after I've implemented a good way of dealing with the difficulties they encounter at close range.

surface_spheres

This program reads a PQRXML file from standard input describing the charged spheres (output from pqr2xml), and outputs an XML file with four lists. The first list is a list of the triangles of the surface spheres; each triangle is a trio of integers, with each integer representing a sphere. These triangles come from an algorithm which is almost identical to that used in Michel Sanner's MSMS. A probe rolls across the molecule surface and touches three spheres at a time as it makes its way. The second list is a list of integers, each representing a surface sphere. The third list is a list of "insiders", or those spheres completely enclosed within a surface sphere. The fourth list is a list of "danglers", or those spheres that hang out into the solvent but could not be picked up by the ball-rolling algorithm. The following input flags are used:

Once the surface spheres and their triangles are computed, the program must then distinguish between the interior spheres and the danglers. A point is selected which is guaranteed to be outside. Then, for each remaining sphere, a line segment is constructed running from the exterior point to the sphere center. The program counts how many surface triangles are intersected by the line segment (this is done using an O(log n) algorithm, so that not every triangle needs to be checked). The number of intersections determines whether the sphere is inside or outside the cage of surface triangles: an odd count means inside, an even count means outside. If the program is unlucky and the line hits a triangle edge, the program will perturb the exterior point slightly and try again.
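The inside/outside test described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it takes the triangle vertices to be the sphere centers, it checks every triangle rather than using the faster search of the real program, and the names segment_hits_triangle and is_inside, the tolerance eps, and the size of the perturbation are all hypothetical.

```python
import random

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def segment_hits_triangle(p0, p1, tri, eps=1e-9):
    """Moller-Trumbore intersection test for the segment p0 -> p1.

    Returns True or False, or None when the hit lies within eps of a
    triangle edge (in barycentric units) and is therefore ambiguous.
    """
    v0, v1, v2 = tri
    d = _sub(p1, p0)
    e1, e2 = _sub(v1, v0), _sub(v2, v0)
    h = _cross(d, e2)
    a = _dot(e1, h)
    if abs(a) < eps:
        return False  # segment parallel to the triangle: treated as a miss here
    s = _sub(p0, v0)
    q = _cross(s, e1)
    u = _dot(s, h) / a
    v = _dot(d, q) / a
    t = _dot(e2, q) / a
    if t < 0.0 or t > 1.0:
        return False  # the plane is crossed beyond the ends of the segment
    smallest = min(u, v, 1.0 - u - v)
    if smallest < -eps:
        return False
    if smallest < eps:
        return None
    return True

def is_inside(center, exterior, triangles, max_tries=20, seed=0):
    """Parity test: an odd number of crossings means 'center' is inside."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        hits = [segment_hits_triangle(exterior, center, tri) for tri in triangles]
        if None not in hits:
            return sum(hits) % 2 == 1
        # Unlucky edge hit: nudge the exterior point and try again.
        exterior = tuple(c + rng.uniform(-1e-3, 1e-3) for c in exterior)
    raise RuntimeError("could not find an unambiguous line of sight")
```

Here triangles is a list of vertex triples, each vertex an (x, y, z) tuple, and exterior is any point known to lie outside the molecule.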

Example:

cat mol.xml | surface_spheres -probe_radius 1.6 -direction 1 0 0 > mol-surface.xml

inside_points

This program outputs an XML file representing a rectangular grid of points, each with a 1 or 0 depending on whether the point is inside (1) or outside (0) the molecule. The program reads in the XML sphere data (from pqr2xml) and the surface information from surface_spheres. For this application, a point is "inside" if it meets at least one of two criteria. First, if the point's distance from the surface of any surface or dangler atom is less than a certain exclusion distance set by the user, it is considered inside. Second, if the point is inside the cage of triangles formed by the surface spheres, it is considered to be inside the molecule. The lower corner, spacing, and number of points in each direction of the grid are set by the user. The 1's and 0's are output in order, starting from the lower corner, with the x-direction varying most rapidly. If the lower corner or spacing are not specified, reasonable and useful defaults are chosen. The following input flags are used:

Each grid point is tested for inside-ness in the same manner as in surface_spheres above. To speed things up, if a point has been found to be inside, the distance to the nearest triangle is found, and all other points within that distance are immediately marked as "inside" as well. If the point is outside, then the distance to the nearest sphere surface is found, and the points within that distance are marked as "outside". This avoids having to find intersecting triangles for most of the grid points.

The big advantage of this algorithm is that it avoids marking interior molecular cavities as exterior. This is essential for efficiently lumping the effective charges together, as seen below.

Example:

inside_points -spheres mol.xml -surface mol-surface.xml -corner -11.1 22.2 -33.3 -spacing 0.5 -ngrid 65 65 65 -exclusion_distance 1.5 > mol-inside-pts.xml
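To make the output ordering concrete, the following Python sketch maps a position in the stream of 1's and 0's back to grid coordinates, using the corner, spacing, and grid size from the example command. The documentation only states that x varies most rapidly; the assumption that y comes next and z varies most slowly, as well as the function name grid_point, are hypothetical.

```python
def grid_point(index, corner, spacing, ngrid):
    """Coordinates of the grid point at a flat position in the output.

    Assumes x varies fastest, then y, then z (the y/z order is an assumption).
    """
    nx, ny, nz = ngrid
    if not 0 <= index < nx * ny * nz:
        raise IndexError("index outside the grid")
    ix = index % nx
    iy = (index // nx) % ny
    iz = index // (nx * ny)
    return tuple(c + spacing * i for c, i in zip(corner, (ix, iy, iz)))

# Values taken from the example command above.
corner = (-11.1, 22.2, -33.3)
spacing = 0.5
ngrid = (65, 65, 65)

first = grid_point(0, corner, spacing, ngrid)          # the lower corner itself
next_in_x = grid_point(1, corner, spacing, ngrid)      # one step along x
next_in_y = grid_point(65, corner, spacing, ngrid)     # one step along y
```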

grid_distances

Given the output of inside_points, this program outputs a like-shaped grid of distances. If the point from inside_points is 0, then the distance is that to the nearest point with a value of 1, and vice versa. The spacing between grid points is obtained from the output of inside_points, so it is not necessary to enter it separately. The algorithm time scales as $N^{3/2}$, where $N$ is the total number of grid points. The following input flag can be used:

Example:

cat mol-inside-pts.xml | grid_distances > mol-distances.xml

lumped_charges

The script lumped_charges takes as its input the XML file of effective charges from compute_effective_charges and outputs another XML file containing a hierarchical grouping of the charges. The key idea is that if the source of the electric field is far away from a group of force centers and you want to compute the force and torque on the group, the group can be compressed into a smaller and faster data structure. Using Chebyshev interpolation, the group can be converted into a data structure that I'll call a chebybox. The chebybox has a rectangular array of 64 positions (4 by 4 by 4) where the electric potential is evaluated. The resulting force is a linear combination of contributions from each position, each multiplied by the electric potential evaluated at that position: $F = \sum_{i=0}^{63} V_i f_i$, where $V_i$ is the electric potential evaluated at point $i$ and $f_i$ is the contribution from point $i$. The torque is computed from a similar linear combination. Mathematically, the chebybox approximation is exact if the potential is a cubic function.
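A one-dimensional analogue may help show why the force reduces to a weighted sum of potential values. The Python sketch below is not the lumped_charges implementation: it uses 4 nodes on a line instead of 64 in a box, assumes Chebyshev nodes of the first kind, and all function names are hypothetical. The weights are $f_i = -\sum_j q_j L_i'(x_j)$, where $L_i$ is the Lagrange basis polynomial of node $i$, so that $F = \sum_i V_i f_i$ reproduces $-\sum_j q_j V'(x_j)$ exactly whenever $V$ is a cubic.

```python
import math

def chebyshev_nodes(lo, hi, n=4):
    """Chebyshev nodes of the first kind, mapped onto the interval [lo, hi]."""
    mid, half = 0.5 * (lo + hi), 0.5 * (hi - lo)
    return [mid + half * math.cos(math.pi * (2 * k + 1) / (2 * n)) for k in range(n)]

def lagrange_derivative(nodes, i, x):
    """Derivative at x of the Lagrange basis polynomial belonging to node i."""
    total = 0.0
    for k, xk in enumerate(nodes):
        if k == i:
            continue
        term = 1.0 / (nodes[i] - xk)
        for m, xm in enumerate(nodes):
            if m != i and m != k:
                term *= (x - xm) / (nodes[i] - xm)
        total += term
    return total

def force_weights(nodes, charges, positions):
    """Precompute f_i so that the force on the group is sum_i V(node_i) * f_i."""
    return [-sum(q * lagrange_derivative(nodes, i, x)
                 for q, x in zip(charges, positions))
            for i in range(len(nodes))]

def lumped_force(weights, nodes, potential):
    """Force on the whole group from potential values at the nodes only."""
    return sum(f * potential(x) for f, x in zip(weights, nodes))

# Three charges inside the interval [0, 1], in a cubic potential.
charges = [1.0, -0.5, 2.0]
positions = [0.1, 0.4, 0.9]
nodes = chebyshev_nodes(0.0, 1.0)
weights = force_weights(nodes, charges, positions)

def potential(x):
    return 2.0 * x**3 - x**2 + 3.0 * x - 5.0

def field(x):  # minus the derivative of the potential
    return -(6.0 * x**2 - 2.0 * x + 3.0)

approx = lumped_force(weights, nodes, potential)
exact = sum(q * field(x) for q, x in zip(charges, positions))
```

The weights depend only on the charges and their positions, so they can be computed once; at run time only the four potential evaluations remain, however many charges the group holds.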

Chebyboxes can be nested. If, during the course of a simulation, the field source comes closer to a group of force centers, then the chebybox representing the group might not provide accurate forces and torques, so the group must be split into two groups, each with its own chebybox. For example, near a point charge, the potential cannot be represented by one cubic function over a volume that is large compared to the distance from the charge. The decision is made by computing the ratio of the distance between the box center and the field source to the box diagonal length. If this ratio is below a certain threshold, the box is divided. Finally, it is not worthwhile to use a chebybox if the number of force centers in a group is much less than 64; it is better to evaluate the force on each center explicitly.
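The splitting rule can be sketched as a recursive walk over the nested boxes. In the Python sketch below, the dictionary layout, the name boxes_to_evaluate, and the threshold value of 2.0 are illustrative assumptions rather than the format or settings used by the program.

```python
import math

def boxes_to_evaluate(box, source, ratio_threshold=2.0):
    """Return the names of the boxes whose chebyboxes should be used.

    A box is used as-is when the source is far enough away relative to the
    box diagonal; otherwise the walk descends into its two sub-boxes.
    """
    distance = math.dist(box["center"], source)
    if box["children"] and distance / box["diagonal"] < ratio_threshold:
        selected = []
        for child in box["children"]:
            selected.extend(boxes_to_evaluate(child, source, ratio_threshold))
        return selected
    return [box["name"]]

# A hypothetical two-level hierarchy.
left = {"name": "left", "center": (-2.5, 0.0, 0.0), "diagonal": 6.0, "children": []}
right = {"name": "right", "center": (2.5, 0.0, 0.0), "diagonal": 6.0, "children": []}
root = {"name": "root", "center": (0.0, 0.0, 0.0), "diagonal": 10.0,
        "children": [left, right]}

far = boxes_to_evaluate(root, (100.0, 0.0, 0.0))   # ratio 10: keep the outer box
near = boxes_to_evaluate(root, (12.0, 0.0, 0.0))   # ratio 1.2: split the outer box
```

A box without sub-boxes is always returned as-is; in that case the forces on its centers would be evaluated explicitly.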

In addition to an outer 4 by 4 by 4 chebybox, the program also generates an outer 3 x 3 x 3 chebybox, which can be used when the molecules are far apart.

The interior of a large molecule will never get close to another molecule, so time and space can be saved by not dividing large chebyboxes buried deep in the molecule. Therefore, one of the optional inputs is the XML file of distances from grid_distances; this tells the program how close the chebybox is to the molecule surface. The following input flags are used:

Example:

lumped_charges -pts mol-charges.xml -dist mol-distances.xml > mol-cheby.xml 

compute_effective_volumes

This program computes effective volumes for each sphere in the molecule. This is needed as an input to the part of the simulation code that computes the desolvation forces. It estimates the volume of space closest to each sphere by Monte Carlo integration, but only includes volume interior to the molecule, as represented by the XML file output from inside_points. If values for the solvent and molecule dielectrics are provided, then the output volumes are scaled in order to function as "effective charges" for computation of desolvation forces. The following input flags are used:

Example:

compute_effective_volumes -spheres mol.xml -inside mol-inside-pts.xml -solvent 79 -solute 4 > mol-volume.xml

This output file can also be fed into the lumped_charges program to generate a chebybox structure for computing desolvation forces:

lumped_charges -pts mol-volume.xml -dist mol-distances.xml > mol-vol-cheby.xml
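The Monte Carlo estimate used by compute_effective_volumes can be illustrated with a short Python sketch. It rests on assumptions not stated in this document: "closest" is taken to mean closest to the sphere center, the inside test is passed in as a function rather than read from the inside_points grid, and the dielectric scaling is left out. The name effective_volumes and its parameters are hypothetical.

```python
import random

def effective_volumes(centers, is_inside, lo, hi, n_samples=20000, seed=0):
    """Estimate, for each sphere, the interior volume nearest to its center.

    Points are sampled uniformly in the box [lo, hi]; points outside the
    molecule are discarded, the rest are credited to the nearest center.
    """
    rng = random.Random(seed)
    box_volume = 1.0
    for a, b in zip(lo, hi):
        box_volume *= b - a
    counts = [0] * len(centers)
    for _ in range(n_samples):
        p = tuple(rng.uniform(a, b) for a, b in zip(lo, hi))
        if not is_inside(p):
            continue
        nearest = min(range(len(centers)),
                      key=lambda j: sum((pc - cc) ** 2
                                        for pc, cc in zip(p, centers[j])))
        counts[nearest] += 1
    return [box_volume * c / n_samples for c in counts]

# Toy molecule: a unit ball shared between two symmetric sphere centers.
def in_unit_ball(p):
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= 1.0

volumes = effective_volumes([(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)],
                            in_unit_ball, (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
```

For this toy case the two volumes should each come out near half the volume of the unit ball, up to sampling noise.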

ellipsoids

This program computes two ellipsoids from the effective volume data (output from compute_effective_volumes). The first ellipsoid has the same inertia matrix as the molecule, based on the volume data. The second ellipsoid is an estimate of the smallest ellipsoid bounding the spheres of the molecule. It is computed as follows. First, the molecule is stretched and compressed along its principal axes in order to make it more nearly spherical. Then, the smallest bounding sphere is computed for the sphere centers. Finally, this bounding sphere is transformed back to make it an ellipsoid, and extra padding is added to account for the radii of the molecule spheres. These ellipsoids are used in computing hydrodynamic properties and time step sizes. The output is an XML file. Example:

cat mol-volume.xml | ./ellipsoids > mol-ellipsoids.xml

mpole_grid_fit

This program computes a multipole fit (out to quadrupole level) to the outer points of the input grid. The fit is done by least squares on the surface of the largest sphere enclosed by the grid. The multipole information is output as an XML file. The following input flags are used:

Example:

cat mol.dx | mpole_grid_fit -debye 9.58 > mol-mpole.xml 

make_surface_sphere_list

This program generates an XML file (equivalent to PQR) of spheres on the surface. It reads in the surface and dangler spheres from the output of surface_spheres, as well as spheres from the reaction file (output of make_rxn_file) that have not been included by surface_spheres. The following input files are used:

Example:

make_surface_sphere_list -surface mol1-surface.xml -spheres mol1.xml -rxn1 mol-rxn.xml 

make_rxn_file

This program generates an XML file that represents a reaction and its criteria. The main inputs are the two PDB files that are output by da-pairs from the SDA package. The following input flags are used:

Example:

 make_rxn_file -files mol1.rxna mol2.rxna -distance 2.1 -nneeded 2 > mol-rxn.xml

normalized_rxn_pairs

This program renumbers the atom names in the reaction criterion to account for the fact that only surface atoms are read into the simulation programs and used for collision detection.

born_integral

This program computes the Born integral for a molecule:
where the integral is over the interior of the molecule, $\lambda$ is the Debye length, and $r$ is the distance from an exterior grid point to the interior point. This integral is computed for all exterior grid points; the geometry is given by the first input. A multipole method is used in order to speed up the execution. The contribution of a charge $q$ to the desolvation energy of its molecule is given by the expression:
where $\epsilon_p$ and $\epsilon_s$ are the dielectrics of the solute and solvent, $\epsilon_0$ is the vacuum permittivity, $r$ is the distance from the charge, and the integral is over the other molecule. This has the same form as the formula used in SDA, where the integral is approximated by a sum over spheres. The SDA formula, which was obtained from a different model by using an approximation, differs by a factor of 6.67 from the Born formula when solvent and solute dielectrics are 78 and 4.

monopole_fit

This program generates a single-charge fit from the electrostatic data. By default, it reads in an OpenDX file from standard input.

sphere_return_distribution

This program computes information used by the simulation program to run the enhanced Northrup-Allison-McCammon algorithm. This information consists of the time and angle distributions of trajectories that reach an outer radius and return to the inner ("b") radius. This enables us to avoid using a large "q" radius. It is an enhancement of an algorithm of Luty, Zhou, and McCammon, and will be submitted for publication.

compute_charges_squared

This program computes the square of the effective charges. It reads in a charge file from standard input, and outputs to standard output in the same XML format. This is used in computing the desolvation forces.

hydro_radius

This program computes the hydrodynamic radius of the molecule by computing the mean chord length (Hansen, 2004).

compute_rate_constant

This program takes as input the results file from nam_simulation and outputs the rates and their confidence intervals.

compute_rate_constant_we

This program takes as input the results file from we_simulation and outputs the rates and their confidence intervals.

bd_top

Front-end program