BrownDye Tutorial

This tutorial will walk you through two different types of BD simulations on the Thrombin-Thrombomodulin system, which is an important component of the blood-clotting cascade. Thrombin consists of 295 amino acids, and thrombomodulin consists of 117 amino acids. (My thanks to Adam Van Wynsberghe for the necessary data on these two molecules.) Here we treat each molecule as a rigid body, or "core". Also, the actual reaction rate is much smaller that what it calculated in this tutorial. However, we want something that runs on one processor for just a few minutes in order to give you a taste of the software, so I have increased the reaction distance (described below) from 5 Angstroms to 15 Angstroms so you can actually obtain a "rate" from only 1000 trajectories.

BrownDye can be easily installed on Linux or Mac OS X. If you have Windows, you can install Cygwin , which provides a free Linux emulator, and then to install BrownDye using the Cygwin terminal. Go here for more details. Other software required will be APBS , Ocaml, and VMD for the visualization.

Also, be sure to check out the website for more up-to-date installation instructions.

Running on a Linux Computer

Other necessary software is the Ocaml compiler, which is free and can be obtained from here . If you are using Ubuntu or a similar Linux based on Debian, you can just type

sudo apt install ocaml

You will also need APBS; this can also quickly be installed on Ubuntu by typing

sudo apt install apbs

The following steps will also work on the Mac terminal and Windows Cygwin terminal. You can install BrownDye on your machine by going to the website , downloading the file browndye.tar.gz, moving the file to a directory of your choice, and typing

tar xvfz browndye.tar.gz
cd browndye
make all

All of the executables are now in the directory browndye/bin; you need to add this directory to your path. If you use a bash shell, this can be done by typing

PATH=$PATH:fullpath/browndye/bin

where fullpath is the full path name of the directory containing the browndye distribution. If you now want to run the thrombin-thrombomodulin tutorial, go to browndye/examples/thrombin.

The atomic coordinates for thrombin and thrombomodulin are in files t.pqr and m.pqr. Because BrownDye works primarily with XML files, you must convert these two files to an equivalent XML format:

pqr2xml < t.pqr > t_atoms.xml
pqr2xml < m.pqr > m_atoms.xml

Next, you must generate the electrostatic grids, in dx format, using APBS:

apbs t.in
apbs m.in

where the input files t.in and m.in are provided. The grids are output to t.dx and m.dx. Be sure to take note of the Debye length in the APBS output if you don't feel like calculating it by hand; it will be needed later. (Some older versions of APBS will name the output files t-PE0.dx and m-PE0.dx; if that happens, be sure to rename the files.)

In addition to atomic coordinates and grids, the third key input is the set of reaction criteria. This can be generated from the two coordinate files t_atoms.xml and m_atoms.xml, and a file, protein_protein_contacts.xml, which describes which pairs of atom types can define a contact (It is actually stored as protein_protein_contacts.xml.bak; you need to copy from this file.) The program make_rxn_pairs takes these three files and a search distance to generate a file of reaction pairs. This assumes that the coordinates of the two molecules are consistent with the bound state.

make_rxn_pairs -mol0 t_atoms.xml -mol1 m_atoms.xml -ctypes protein_protein_contacts.xml -dist 15 > reaction_pairs.xml

The resulting file still is not suitable for input into the simulation programs, however. I've made the program general enough to have more than one reaction in a simulation, so one could envision having several such reaction pair files that would need to combined into a final reaction description file. For now, you can use the program make_rxn_file, which generates an input file for the case of one reaction:

  make_rxn_file -pairs reaction_pairs.xml -state_from before -state_to after -rxn association -mol0 thrombin thrombin -mol1 tmodulin tmodulin -distance 15.0 -nneeded 3 > reactions.xml

This generates a reaction description file which tells the simulation programs that if any 3 of the atom pairs approach within 15 Angstroms, a reaction occurs.

A note: if you don't feel like typing the above commands in, especially if you make changes and need to do it repeatedly, I have included a Makefile in the example directory. Typing

make all

should run the above commands.

The remaining pieces of information are contained in the file input.xml. (This must be copied from the file input.xml.bak.) It contains, among other things, information on the solvent, information on each molecule, and parameters governing the simulation itself. For the sake of efficiency, the larger molecule should be contained in the first group if there is a large size difference. Also, each molecule is assigned a prefix from the name of core; these are used in naming the intermediate files that are generated in the next step:

bd_top input.xml

The bd_top program is written in Ocaml using a Unix "make"-like utility that I wrote to help orchestrate the creation of the files. Like "make", if an intermediate file is changed or replaced, running bd_top (which is analogous to a Makefile) will run only those commands necessary to re-generate files that depend on the updated file. Unlike "make", this utility, which I call the "Orchestrator", can also read information from xml files and have chains of dependent calculations (eventually I want to write a version in Python so it will look more familiar to most people). So, when the command is executed, the following files are generated for thrombin:

thrombin_charges.xml - a set of effective charges
thrombin_hydro_radius.xml - hydrodynamic parameters
thrombin_charges_squared.xml - effective charges squared; used for the desolvation forces
thrombin_squared_cheby.xml - Chebyshev interpolation data structure for effective charges squared; used for desolvation forces
thrombin_surface.xml - a list of atoms on the surface found by rolling a probe sphere.
thrombin_surface_atoms.xml - actual list of surface atoms to be used for collision detection
thrombin_mpole.xml - multipole expansion of effective charges
thrombin_inside.xml - grid of points denoting inside (1) or outside (0)
thrombin_born.dx - grid of values used to estimate the desolvation energy; uses a Born model.

The corresponding files with prefix "tmodulin" are also generated for thrombomodulin. In addition, the following file is generated:

tmodulin_cheby.xml - Chebyshev interpolation data structure for effective charges; used for Coulombic forces

The following files are generated for both molecules:

thrombin_tmodulin_solvent.xml - solvent information
thrombin_tmodulin_simulation.xml - contains additional information generated by the auxilliary programs; is read directly into the simulation programs.

The nice thing about these intermediate files is that any of them can be replaced or changed, and then bd_top run again to update everything. For example, right now I'm using a simple test-charge approximation by default, but one could easily generate effective charges using another program such as SDA, convert the output into the appropriate XML format, replace thrombin_charges.xml and tmodulin_charges.xml, and run bd_top again.

Another useability note: if you want to clean things up, you can delete all *.dx and *.xml files; the two xml files that you need to get started are also available as input.xml.bak and protein_protein_contacts.xml.bak.

At this point, you can choose to do a simulation of one trajectory at a time, or you can do a weighted-ensemble simulation. In general, the weighted-ensemble method is not as efficient at the single-trajectory method unless the probability of a reaction event is very low.

One trajectory at a time

To perform the single-trajectory simulation, the following is executed:

nam_simulation thrombin_tmodulin_simulation.xml

The results end up in results.xml, as designated in input.xml. As the simulation proceeds, you can look at results.xml to see the simulation progress. (This is called nam_simulation after Northrup, Allison, and McCammon, who came up with the first algorithm of this type. This code uses a fancy variation on the orginal algorithm.).

At any point, you can use compute_rate_constant to analyze results.xml and obtain an estimate of and 95% confidence bounds on the reaction rate constant in units of M/s:

compute_rate_constant < results.xml

This will put the rate constant results to standard output.

Visualization

To visualize a trajectory leading to a reaction, you must include the line

<trajectory_file> trajectory </trajectory_file>

in the input file, run bd_top and nam_simulation again. The value trajectory gives its names to the resulting trajectory files, and can be what you want. Because the files can get quite large, you might want to also include a line such as

< n_steps_per_output > 10 < /n_steps_per_output >

which causes only every 10th configuration to be output. At the end, the files trajectory0.xml and trajectory0.index.xml will be generated. The first file contains the actual trajectories in a compressed format, and the second file functions as an index to the first file. If you are running with more than one thread, more files like this will also be generated, with names trajectory1.xm1 , trajectory1.index.xml , etc., with one pair of files per thread. After the initial trajectory files are generated, they must be processed further. For example, if you wish to view a trajectory that results in a reaction, you must find such a trajectory by running

process_trajectories -traj trajectory0.xml -index trajectory0.index.xml -srxn association

which prints out the numbers of reactive trajectories. Then, pick one of the numbers, say 8, and run

process_trajectories -traj trajectory0.xml -index trajectory0.index.xml -n 8 > trajectory.xml

This uncompresses the contents of trajectory0.xml and outputs the desired trajectory into trajectory.xml. At this point, you can cut down even further on the number of configurations by adding to the above to get

process_trajectories -traj trajectory0.xml -index trajectory0.index.xml -n 8 -nstride 10 > trajectory.xml

which outputs only every tenth configuration in trajectory0.xml. The final step is to generate an vtf-format file for visualization by VMD, which is obtained by running

vtf_trajectory -traj trajectory.xml > trajectory.vtf

This file can be very large, since all of the atom positions for every configuration are output to traj.vtf, so it can be important to reduce the number of trajectories upstream. This file then can be passed to VMD for visualization. If you are worried about the size of the file, you can run

vtf_trajectory -mol0 t_atoms.xml -mol1 m_atoms.xml -trajf trajectory.xml -trial

which will output nothing except an estimate of the file size. If it looks like it might be too large, you can go back and increase the stride argument to process_trajectories . At last, you can start up VMD, load in the vtf file, and watch the animation.

Weighted Ensemble Method

To perform the weighted-ensemble method, you must first generate the bins for the system copies:

build_bins thrombin_tmodulin_simulation.xml

The number of system copies used in the bin-building process are given in input.xml in the n_copies tag. As it runs, reaction coordinate numbers will go scrolling past; they should keep getting smaller and eventually stop. If that does not happen, i.e., the numbers keep on going, you might need to increase the number of system copies, or it might be that your reaction criterion is unattainable. Assuming it converges, the bin information is placed in t_m_bins.xml. The actual weighted-ensemble simulation is then run:

we_simulation thrombin_tmodulin_simulation.xml

As before, the results are output to results.xml. In each row of output numbers, the right-most number is the flux of system copies that escaped without reaction, while the other ones are reactive fluxes. So, even for a rare reaction event, you should at least see small numbers for the reactive fluxes after the system has reacted steady-state. This can be visually examined at any point, and can also be analyzed as above, but with a different program:

compute_rate_constant_we < results.xml

Because the streams of numbers are autocorrelated, a more sophisticated approach for computing confidence intervals is used, and if there are not enough data points, the program compute_rate_constant_we will simply refuse to provide an answer.

Ideas for fun

You can change the random number generator seed, under the seed tag. Good to do if you're bored but don't have the energy to do anything else.

One parameter to play with is the reaction criterion distance, which is the -distance input to the program make_rxn_file. The number given in the tutorial and the Makefile is 15.0, but you can change that by re-running make_rxn_file or by changing it in the Makefile.

You can also change the ionic strength in files t.in and m.in and generate new APBS grids. Note: you must take note of the new Debye length and put that value in the file input.xml. So far, it is not possible to automatically get the Debye length from the output DX file of APBS.

If your machine has several processors, you can change the value under the tag n_threads in file input.xml and see it run under several processors. So far, it runs only on shared-memory machines using pthreads. Even on a single-processor machine, you can still run several threads, but it does not make the programs go any faster.

A final useability note: Most of the programs will output a description of themselves and their options if you type in

program -help

Notes on Running BrownDye on Windows and Macintosh

The Cygwin package is easy to install; you go to the website and download and run the setup.exe file. The only part that requires some effort is finding the necessary packages to install. If you do nothing, Cygwin installs a minimal subset, but to run Browndye you need gcc, make, autoconf, and ocaml. You could install all optional packages, but it makes for a huge and lengthy download. Once Cygwin is installed, you can pop open a terminal window, giving you access to a directory tree just like any other Unix system. If you are missing packages, you can run setup.exe again, which will again present you the choice of packages.

In order to obtain gcc for OS X on the Mac, you need to go to Apple's website and register as a developer. If you are doing your work in a university, this is free. Then, you install the XCode Developer tools, which include gcc. Ocaml is available for the Mac from the Ocaml website.

References

Huber, GA and McCammon, JA. Brownian Dynamics Simulations of Biological Molecules Trends in Chemistry 1, 727-738 (2019)

Huber, GA and McCammon, JA. Browndye: A Software Package for Brownian Dynamics. Computer Physics Communications 181, 1896-1905 (2010)

Ermak, DL and McCammon, JA. Brownian Dynamics with Hydrodynamic Interactions, J. Chem. Phys. 69, 1352-1360 (1978)

Northrup SH, Allison SA and McCammon JA. Brownian Dynamics Simulation of Diffusion-Influenced Bimolecular Reactions J. Chem. Phys. 80, 1517-1526

Luty BA, McCammon JA and Zhou HX. Diffusive Reaction-Rates From Brownian Dynamics Simulations - Replacing the Outer Cutoff Surface by an Analytical Treatment, J. Chem. Phys. 97, 5682-5686 (1992)

Huber GA and Kim S. Weighted-ensemble Brownian dynamics simulations for protein association reactions, Biophys. J 70, 97-110 (1996)

Gabdoulline RR and Wade RC. Effective charges for macromolecules in solvent, J. Phys. Chem 100, 3868-3878 (1996)