One of the most useful tools in a Unix-style environment is the make utility, which keeps a collection of files generated by various programs up-to-date by running only the programs necessary to update files downstream from a modified file. However, it has its limitations. First of all, the "language" used by make is not a full-fledged programming language, and it would be useful to have the flexibility offered by a programming language. Secondly, it would be useful for such a utility to be able to pull values from a file and use them as arguments to a program, rather than be required to use a whole file as an argument.
The Orchestrator is an attempt to provide such a utility. It was written mainly as a utility for my molecular simulation package BrownDye, but I think that it could be useful for a wider range of applications. Right now, it is written in OCaml, and the following documentation reflects that. However, I realize that OCaml is not widely used on this side of the Atlantic, even though it is my favorite language, so one of my goals for this project is to rewrite the Orchestrator in Python to make it accessible to more people. But even if you don't know OCaml, this documentation should give you an idea of what the Orchestrator does.
An Orchestrator script, which is analogous to a "Makefile", is composed of various source objects. The sources have one or more inputs, and one output. The output of a source can be the input to another source, and the result is a directed acyclic graph. Updating a source first updates all of the sources upstream of it, that is, the sources which ultimately feed into it. Each source has a time associated with it, which is the time of its last update. In order for a source to be up-to-date, its time must be greater (newer) than the time of every source upstream. Often, this updating process will result in the running of a program which processes files to create a new file.
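The update rule described above can be sketched in a few lines of OCaml. This is only an illustration and not the Orchestrator's own code: the record type node, the logical clock, and the names update, file, and derived are all assumptions made for the example, and the file names are made up.

```ocaml
(* Hypothetical model of a source; NOT the Orchestrator's actual type. *)
type node = {
  name : string;
  mutable time : float;      (* time of the last update *)
  inputs : node list;        (* sources feeding directly into this one *)
  rebuild : unit -> unit;    (* action that regenerates the output *)
}

(* A logical clock stands in for wall-clock time, so the example is
   deterministic. *)
let clock = ref 0.0
let tick () = clock := !clock +. 1.0; !clock

(* Names of the sources that were rebuilt, most recent first. *)
let rebuilt = ref []

(* Bring [n] up to date: first update everything upstream, then rebuild
   [n] only if some input is at least as new as [n] itself. *)
let rec update n =
  List.iter update n.inputs;
  if List.exists (fun i -> i.time >= n.time) n.inputs then begin
    n.rebuild ();
    rebuilt := n.name :: !rebuilt;
    n.time <- tick ()
  end

(* A source with no inputs, standing for an existing file. *)
let file name =
  { name; time = tick (); inputs = []; rebuild = (fun () -> ()) }

(* A source computed from other sources. A real rebuild action would
   run a program here; this one does nothing. *)
let derived name inputs =
  { name; time = 0.0; inputs; rebuild = (fun () -> ()) }

let a = file "a.dat"
let b = derived "b.out" [a]
let c = derived "c.out" [b]

let () = update c   (* rebuilds b.out, then c.out *)
let () = update c   (* everything is up to date, so nothing is rebuilt *)
```

If the time of a is later advanced (as if the file had been modified), the next call to update c rebuilds both b.out and c.out again, which is the same behavior one expects from make.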
The module Orchestrator introduces two new data types. The first is defined as
type data =
| Int of int
| Float of float
| String of string
| Float3 of float * float * float
| Bool of bool
| Null
Because OCaml is statically typed, we have to explicitly define
a data type which can hold integers, floating-point values,
strings, and booleans.
(For my own convenience, I also let it represent a 3D vector of
floats.) The Python version will not have this issue. This
data type represents the data flowing between the sources.
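As an illustration of how such values might be consumed, here is a minimal sketch that turns a data value into text suitable for passing as a program argument. The function string_of_data is hypothetical and is not part of the Orchestrator module; the type definition is repeated only so that the example is self-contained.

```ocaml
type data =
  | Int of int
  | Float of float
  | String of string
  | Float3 of float * float * float
  | Bool of bool
  | Null

(* Hypothetical helper (an assumption, not from the module):
   render a value as the text of a command-line argument. *)
let string_of_data = function
  | Int i -> string_of_int i
  | Float x -> Printf.sprintf "%g" x
  | String s -> s
  | Float3 (x, y, z) -> Printf.sprintf "%g %g %g" x y z
  | Bool b -> string_of_bool b
  | Null -> ""

let args = List.map string_of_data [Int 3; Float 1.5; Bool true]
```

Pattern matching forces every constructor to be handled, so adding a new kind of value to the type makes the compiler point out each function that needs updating.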
The second data type is the source itself:
type source
Objects of this type are generated by the various functions below.
val new_in_file_source: string (* file *) -> source
This function takes the name of a file and returns a source representing
that file. The time associated with this source is the time of the
file's last modification.
val new_command_source: string (* command *) -> string (* output_file *) -> source
This function takes two string arguments. The first argument is the
name of the program which is run to generate the output.
The second argument is the name of the file to which the standard
output of that program is directed.
The inputs to the command source are defined by the following functions:
val add_stdin_prereq: source (* command *) -> source (* prereq *) -> unit
val add_prereq: source (* command *) -> string (* tag *) -> source (* prereq *) -> unit
val add_opt_prereq: source (* command *) -> string (* tag *) -> source (* prereq *) -> unit
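To show how these functions might fit together in a script, here is a rough sketch. The module below is only a stand-in so that the example compiles on its own: the function names and argument order follow the signatures above, but the record type and the function bodies are assumptions, and they merely record what was requested. The meaning of the tag argument is not described here, so the stub just stores it. The program name make_grid and the file names are made up.

```ocaml
(* Stand-in for the real module. Only the function names and argument
   order come from the documentation; everything else is hypothetical. *)
module Orchestrator = struct
  type source = {
    label : string;
    mutable stdin_prereq : source option;
    mutable prereqs : (string * source) list;
  }

  let make label = { label; stdin_prereq = None; prereqs = [] }

  let new_in_file_source file = make file

  let new_command_source command output_file =
    make (command ^ " > " ^ output_file)

  let add_stdin_prereq command prereq =
    command.stdin_prereq <- Some prereq

  let add_prereq command tag prereq =
    command.prereqs <- (tag, prereq) :: command.prereqs

  (* The stub does not distinguish optional prerequisites. *)
  let add_opt_prereq = add_prereq
end

open Orchestrator

(* Two existing files (hypothetical names). *)
let atoms = new_in_file_source "atoms.xml"
let params = new_in_file_source "params.txt"

(* A command whose standard output goes to grid.dat. *)
let grid = new_command_source "make_grid" "grid.dat"

let () =
  add_stdin_prereq grid atoms;      (* atoms.xml feeds the command's stdin *)
  add_prereq grid "params" params   (* params.txt attached under a tag *)
```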
The function
Still under construction