BrownDye Tutorial

This tutorial will walk you through two different types of BD simulations on the Thrombin-Thrombomodulin system, which is an important component of the blood-clotting cascade. Thrombin consists of 295 amino acids, and thrombomodulin consists of 117 amino acids. (My thanks to Adam Van Wynsberghe for the necessary data on these two molecules.)

First, in order to use BrownDye on these machines, you must set your path to include the executable files:

PATH=$PATH:/u/ieng6/nbcr09/nbcr09/public/elstat/browndye/browndye/bin

The new directory contains the executables of BrownDye. You also need to copy the file /u/ieng6/nbcr09/nbcr09/public/elstat/browndye/thrombin-example.tar.gz to your home directory, and do

tar xvfz thrombin-example.tar.gz

from within your home directory. This will give you a directory "thrombin-example"; the commands of the tutorial example will be run from within this directory

The atomic coordinates for thrombin and thrombomodulin are in files t.pqr and m.pqr. Because BrownDye works primarily with XML files, you must convert this two files to an equivalent XML format:

pqr2xml < t.pqr > t-atoms.xml
pqr2xml < m.pqr > m-atoms.xml

Next, you must generate the electrostatic grids, in dx format, using APBS:

apbs t.in
apbs m.in

where the input files t.in and m.in are provided. The grids are output to t.dx and m.dx. Throughout this session, thrombin will be denoted by the prefix "t" while thrombomodulin will be denoted by prefix "m". Be sure to take note of the Debye length in the APBS output if you don't feel like calculating it by hand; it will be needed later.

In addition to atomic coordinates and grids, the third key input is the set of reaction criteria. This can be generated from the two coordinate files t-atoms.pqrxml and m-atoms.pqrxml, and a file, protein-protein-contacts.xml, which describes which pairs of atom types can define a contact. The program make_rxn_pairs takes these three files and a search distance to generate a file of reaction pairs. This assumes that the coordinates of the two molecules are consistent with the bound state.

make_rxn_pairs -mol1 t-atoms.pqrxml -mol2 m-atoms.pqrxml -ctypes protein-protein-contacts.xml -dist 5.5 > t-m-pairs.xml

The resulting file still is not suitable for input into the simulation programs, however. I've made the program general enough to have more than one reaction in a simulation, so one could envision having several such reaction pair files that would need to combined into a final reaction description file. For now, you can use the program make_rxn_file, which generates an input file for the case of one reaction:

make_rxn_file -pairs t-m-pairs.xml -distance 5.5 -nneeded 2 > t-m-rxns.xml

This generates a reaction description file which tells the simulation programs that if any 2 of the atom pairs approach within 5.5 Angstroms, a reaction occurs.

A note: if you don't feel like typing the above commands in, especially if you make changes and need to do it repeatedly, I have included a Makefile in the example directory. Typing

make all

should run the above commands.

The remaining pieces of information are contained in the file input.xml. It contains, among other things, information on the solvent, information on each molecule, and parameters governing the simulation itself. For the sake of efficiency, the larger molecule should be "molecule0" if there is a large size difference. Also, each molecule is assigned a prefix as mentioned above; these are used in naming the intermediate files that are generated in the next step:

bd_top input.xml

The bd_top program is written in Ocaml using a Unix "make"-like utility that I wrote to help orchestrate the creation of the files. Like "make", if an intermediate file is changed or replaced, running bd_top (which is analogous to a Makefile) will run only those commands necessary to re-generate files that depend on the updated file. Unlike "make", this utility, which I call the "Orchestrator", can also read information from xml files and have chains of dependent calculations (eventually I want to write a version in Python so it will look more familiar to most people). So, when the command is executed, the following files are generated for thrombin:

t-charges.xml - a set of effective charges
t-cheby.xml - Chebyshev interpolation data structure for effective charges
t-distances - a dx file containing the distances of each exterior point to the nearest interior point, and vice-versa
t-ellipsoids.xml - bounding ellipoid using for estimating hydrodynamic interactions
t-hydro-radius.xml - hydrodynamic radius
t-q2.xml - effective charges squared; used for the desolvation forces
t-q2-cheby.xml - Chebyshev interpolation data structure for effective charges squared
t-surface.xml - a list of atoms on the surface found by rolling a probe sphere.
t-surface-atoms.xml - actual list of surface atoms to be used for collision detection
t-volume.xml - list of effective volumes of each atom
t-mpole.xml - multipole expansion of effective charges
t-monopole.xml - total charge
ins-t.xml - grid of points denoting inside (1) or outside (0)

The corresponding files with prefix "m" are also generated for thrombomodulin.

The following files are generated for both molecules:

t-m-solvent.xml - solvent information
t-m-old-data.xml - previous data, used for detecting changes in parameters
t-m-normed-rxns.xml - because only the surface atoms are read into the simulation programs, the atoms in the reaction criteria must be renumbered so that they are in sequence. This is the resulting file.
t-m-return-dist.xml - information used for appropriately returning the molecule to the b-sphere; has data on the time and probability distributions.
t-m-simulation.xml - contains additional information generated by the auxilliary programs; is read directly into the simulation programs.

The nice thing about these intermediate files is that any of them can be replaced or changed, and then "bd_top" run again to update everything. For example, right now I'm using a simple test-charge approximation by default, but one could easily generate effective charges using another program such as SDA, convert the output into the appropriate XML format, replace "t-charges.xml" and "m-charges.xml", and run "bd_top" again.

Another useability note: if you want to clean things up, you can delete all *.dx and *.xml files; the two xml files that you need to get started are also available as "input.xml.bak" and "protein-protein-contacts.xml.bak".

At this point, you can choose to do a simulation of one trajectory at a time, or you can do a weighted-ensemble simulation. In general, the weighted-ensemble method is not as efficient at the single-trajectory method unless the probability of a reaction event is very low.

To perform the single-trajectory simulation, the following is executed:

nam_simulation t-m-simulation.xml

The results end up in "results.xml", as designated in "input.xml". As the simulation proceeds, you can look at "results.xml" to see the simulation progress. (This is called "nam_simulation" after Northrup, Allison, and McCammon, who came up with the first algorithm of this type. This code uses a fancy variation on the orginal algorithm.).

At any point, you can use "compute_rate_constant" to analyze "results.xml" and obtain an estimate of and 95% confidence bounds on the reaction rate constant in units of M/s. The file "t-m-solvent.xml" must also be given to this program:

cat results.xml | compute_rate_constant

This will put the rate constant results to standard output.

To perform the weighted-ensemble method, you must first generate the bins for the system copies:

build_bins t-m-simulation.xml

The number of system copies used in the bin-building process are given in "input.xml" in the "n-copies" tag. As it runs, reaction coordinate numbers will go scrolling past; they should keep getting smaller and eventually stop. If that does not happen, i.e., the numbers keep on going, you might need to increase the number of system copies, or it might be that your reaction criterion is unattainable. Assuming is converges, the bin information is place in "t-m-bins.xml". The actual weighted-ensemble simulation is then run:

we_simulation t-m-simulation.xml

As before, the results are output to "results.xml". In each row of output numbers, the right-most number is the flux of system copies that escaped without reaction, while the other ones are reactive fluxes. So, even for a rare reaction event, you should at least see small numbers for the reactive fluxes after the system has reacted steady-state. This can be visually examined at any point, and can also be analyzed as above, but with a different program:

compute_rate_constant_we -sim results.xml -solvent t-m-solvent.xml

Because the streams of numbers are autocorrelated, a more sophisticated approach for computing confidence intervals is used, and if there are not enough data points, the program "compute_rate_constant_we" will simply refuse to provide an answer.

Ideas for fun

You can change the random number generator seed, under the "seed" tag. Good to do if you're bored but don't have the energy to do anything else.

One parameter to play with is the reaction criterion distance, which is the "-distance" input to the program "make_rxn_file". The number given in the tutorial and the Makefile is 5.0, but you can change that by re-running "make_rxn_file" or by changing it in the Makefile.

You can also change the ionic strength in files "t.in" and "m.in" and generate new APBS grids. Note: you must take note of the new Debye length and put that value in the file "input.xml". So far, it is not possible to automatically get the Debye length from the output DX file of APBS (another good argument for using XML!).

Although I'm pretty sure that the machines you're working on are single-processor machines, if you have access to a machine with several processors, you can change the value under the tag "n-threads" in file "input.xml" and see it run under several processors. So far, it runs only on shared-memory machines using "pthreads". Let me know if you want to try it on another machine; you can copy the BrownDye distribution over and compile it if you have Ocaml, or you might be able to copy the executables over if you don't. (Ocaml is quite easy to install.) Even on a single-processor machine, you can still run several threads, but it does not make the programs go any faster.

A final useability note: Most of the programs will output a description of themselves and their options if you type in

program -help

References

Ermak, DL and McCammon, JA. Brownian Dynamics with Hydrodynamic Interactions, J. Chem. Phys. 69, 1352-1360 (1978)

Northrup SH, Allison SA and McCammon JA. Brownian Dynamics Simulation of Diffusion-Influenced Bimolecular Reactions J. Chem. Phys. 80, 1517-1526

Luty BA, McCammon JA and Zhou HX. Diffusive Reaction-Rates From Brownian Dynamics Simulations - Replacing the Outer Cutoff Surface by an Analytical Treatment, J. Chem. Phys. 97, 5682-5686 (1992)

Huber GA and Kim S. Weighted-ensemble Brownian dynamics simulations for protein association reactions, Biophys. J 70, 97-110 (1996)

Gabdoulline RR and Wade RC. Effective charges for macromolecules in solvent, J. Phys. Chem 100, 3868-3878 (1996)