Generic Weighted Ensemble Dynamics Driver

The Weighted Ensemble Dynamics algorithm (Huber and Kim, Biophys. J, 1996) is useful for finding rates of rare events. It maintains several copies of the system of interest, and divides and merges copies to enhance sampling over free energy barriers along the reaction coordinate. It makes use of divisions of configuration space usually called bins, and approximately enforces the condition of having equal number of system copies in each bin.

This code can run several threads on any machine that has pthreads, which is standard on most architectures. Upon the first call to step_objects_forward, the threads are created and used to independently call step_forward on the objects. The threads are kept waiting between calls to step_objects_forward, and are sychronized so that the main thread does not return until all the threads are finished.

In addition to the simulation, this code can also generate sequential bins along a user-defined reaction coordinate. The user starts the system copies at one end of the coordinate, and the bins are generated by stepping the copies forward and forcing the ensemble along the coordinate.

The algorithm is wrapped in a generic class Sampler< Interface>, which sits in the namespace Weighted_Ensemble_Dynamics::Multithreaded. (I still have a single-threaded version which does not use pthreads at all; the multithreaded version sits on top of it.) The header file we_multithread.hh is included (which then pulls in other header files in the same directory), and the code must also be linked to the compiled result of barrier.cc. Both of these files are in the src directory. The Interface class, provided by the user, must have the following members:

typename Info - reference to an object that contains all the information about the system to be simulated, including the system copies and their weights. In most cases this will be pointer.
typename Object - reference to an individual object; used in conjunction with the Info reference. Often this will be a pointer, but one could also use an integer index, for example.
typename Exception - type of exception that is thrown from inside
void initialize_object( Info info, Object obj) - called on obj at the beginning of the simulation and whenever it is restarted after crossing a reaction boundary.
void step_forward( Info info, unsigned int ithread, Object obj) - the configuration of obj is stepped forward by one time step. Thread number ithread is the one doing the work. Threads are numbered starting with 0.
unsigned int bin( Info info, Object obj) - returns the integer designating the bin containing obj
unsigned int bin( Info info, Object obj, bool& exists) - returns the integer designating the bin containing obj. Also outputs into exists whether a bin actually exists. If exists is false after the function call, the integer return value is undefined. This function is necessary because no bins are defined yet during the bin-building process.
Object_Ref new_object( Info info) - creates a new object and returns its reference. It is the user's responsibility to assign an appropriate weight to the object.
void get_duplicated_objects( Info info, Object obj, unsigned int ncopies, std::list< Object>& objs)- places ncopies copies of obj into the list objs, which is initially empty. Also, the weight originally in obj is split evenly among obj and its copies. So, if obj initially has a weight of 1 and ncopies is 3, then after the call obj and the three new copies each have a weight of 0.25.
void remove( Objects objs, Object obj) - removes obj from the container.
void get_crossing( Info info, Object obj, bool& crossed, unsigned int& crossing) - We allow for more than one reaction condition, and each reaction condition is designated by a non-negative integer, starting at zero. This function tests the object obj to see if it has crossed. (crossed is set true if so, false otherwise). If there is a crossing, then crossing is set to the integer for that crossing.
unsigned int number_of_bins( Info info)
unsigned int number_of_bin_neighbors( Info info, unsigned int bin) - Each bin is adjacent to one or more other bins. For example, in a conventional reaction coordinate approach, each bin would have two neighbors, except for the ones at either end. We allow for more general bin connectivity, so this function returns the number of neighbors of bin.
template< class Function> void apply_to_bin_neighbors( Info info, unsigned int bin, Function&) - applies an arbitary function object Function to each reference of of neighbors of bin. Function will have the () operator overloaded to take an unsigned integer.
bool have_population_reversal( Info info, unsigned int bin0, double wt0, unsigned int bin1, double wt1) - At steady state, one expects some bins to have more system statistical weight than others. Due to fluctuations in the simulation, this isn't always so, so the algorithm will merge bins in order to get better convergence. This function tests the weights (wt0 and wt1) of the bins (bin0 and bin1) and returns true if the relative weights aren't what one would expect in steady state. This is optional, and one can always just return false and the algorithm will still work, but maybe not as efficiently.
void set_null( Info& info) - sets reference to a "null" value, which is used to denote an uninitialized reference.
bool is_null( Info info) - returns true if reference is to a "null" value.
double uniform( Info info) - returns a random number uniformly distributed between 0 and 1.
double weight( Info info, Obj obj) - weight of obj
void set_weight( Info info, double weight, Obj obj) - sets weight of obj to weight.
unsigned int number_of_threads( Info info) - number of threads used by system.
void output( Info info, unsigned int crossing, double prob) - informs the user's software that an amount prob of probability has crossed crossing since the last time step.
template< class Function> void apply_to_object_refs( Function& f, Info info) - templated function that applies an arbitrary function object f to each object. Function will have the () operator overloaded to take an object reference.
double reaction_coordinate( Info info, Obj obj) - returns value of reaction coordinate of obj. This is used only by the bin-building process.

The Sampler class itself has the following member functions:

void set_number_of_objects( unsigned int n) - this is the target number about which the total number of objects fluctuates at steady-state.
void set_number_of_threads( unsigned int) - sets number of threads. It is probably best not to use any more threads than processors. If the number of threads is not set, then only one thread is created.
void set_number_of_fluxes( unsigned int n) - number of different crossing, or reaction criteria. If not called, the number of crossings is one.
void set_random_number_generator( RNG_Ref)
void set_bins( Bins_Ref)
void set_object_container( Objects_Ref objs) - initially objs should be empty
void initialize_objects() - populates the object list with objects at the start of the process.
void step_objects_forward() - once everything is initialized, each object is stepped forward.
void set_number_of_moves_per_output( unsigned int n) - If set, flux information is output every n calls to step_objects_forward; otherwise, it is output after every call.
void renormalize_objects() - splits and deletes objects in order to maintain bin populations.
void generate_bins( Info info, std::list< double>& bins) - places the bin partitions into the initially empty list bins. At the end, the reaction coordinates are ordered from highest to zero. Before calling the function, the number of objects is set by set_number_of_objects, but the objects themselves are not yet created. Then, the function is called once.

Normally, after initializing everything, most of the work will take place in a loop such as this:

for( int i = 0; i < n_moves; i++){
  sampler.step_objects_forward();
  sampler.renormalize_objects();
}