The Application Checkpoint/Restart Library project uses an interposition library to abstract the application away from the checkpoint storage and notification mechanisms, which are often system specific. The library wraps the existing checkpoint and restart functionality, providing just enough separation between the system and the application C/R functionality for the library to coordinate their respective activities efficiently.
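As a rough illustration of the interposition idea described above, the following minimal sketch wraps an application-supplied checkpoint routine and a system-specific storage step behind a single call. It is written in Python purely for brevity. Every name in it (`CheckpointInterposer`, `app_checkpoint`, `system_store`, the event strings) is hypothetical and does not come from the library's actual API.

```python
from typing import Callable, List


class CheckpointInterposer:
    """Hypothetical interposition layer; NOT the library's real API."""

    def __init__(
        self,
        app_checkpoint: Callable[[], str],
        system_store: Callable[[str], None],
    ) -> None:
        self._app_checkpoint = app_checkpoint  # application-side C/R routine
        self._system_store = system_store      # system-specific storage step
        self.events: List[str] = []            # stand-in for notifications

    def checkpoint(self) -> str:
        """Coordinate one checkpoint: notify, let the app save, then store."""
        self.events.append("checkpoint-begin")
        path = self._app_checkpoint()   # the application writes its own state
        self._system_store(path)        # the system decides where it is kept
        self.events.append("checkpoint-end")
        return path


# Usage with stand-in callbacks (assumed for illustration only).
stored: List[str] = []


def app_checkpoint() -> str:
    # A real application would serialize its state here.
    return "app_state.ckpt"


def system_store(path: str) -> None:
    # A real system would copy the file to stable storage here.
    stored.append(path)


interposer = CheckpointInterposer(app_checkpoint, system_store)
result = interposer.checkpoint()
```

The point of the sketch is that the application never calls the storage step directly, and the system never needs to know how the application serializes its state. The wrapper is the only place where the two sides meet.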
The principal goals of this project are to:
The Application Checkpoint/Restart Library source code is currently hosted at the site below:
For instructions on how to build and install from source see the Installation page.
Currently, only the CIFTS Fault Tolerance Backplane (FTB) events described in the API Reference are supported. More events may be added in the future.
Currently, the Application C/R Library is shipped as part of the Open MPI project. The only reason for this dependency is that we were too lazy to make our own build system, so we piggyback on theirs at the moment. In a later revision we will separate the two projects more cleanly.