
Fault Tolerance Research @ Open Systems Laboratory

CIFTS in Open MPI


Installation Guide

This guide describes how to set up Open MPI with Fault Tolerance Backplane (FTB) support. FTB support in Open MPI is optional and turned off by default. After following these instructions, you should be able to throw events from Open MPI to the FTB.

Installing FTB

  • Fetch the latest version of FTB (0.6.1) from the CIFTS downloads page.
  • Detailed instructions for installing FTB can be found in the docs/chapters/ directory of the downloaded tarball. The CIFTS wiki has an online version of the installation manual.

Installing Open MPI with FTB support

FTB support in Open MPI is currently available in the development trunk. Consult the Open MPI website for details on how to access the code through either a Subversion checkout or a nightly tarball.

  • See the page Requirements to Build from a Subversion Checkout for directions on building Open MPI from an SVN checkout.
  • In addition to your desired configure flags, FTB-specific flags must be added to enable the FTB notifier interface. There are two FTB-specific flags; the example here uses --with-ftb.
    If the FTB libraries are installed at /opt/, the configure command would be
      shell$ ./configure --with-ftb=/opt/ <configure args>
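
The configure step above can be wrapped in a small helper that checks the FTB installation directory before assembling the command line. This is only a sketch: the function name ftb_configure_cmd is hypothetical, and the only FTB flag it uses is the --with-ftb flag shown above. It prints the command rather than running it, so the result can be reviewed first.

```shell
#!/bin/sh
# Sketch: assemble the configure command for an FTB-enabled build.
# ftb_configure_cmd is a hypothetical helper name, not part of Open MPI or FTB.

# Usage: ftb_configure_cmd <ftb_prefix> [extra configure args...]
# Prints the configure command line, or fails if the prefix does not exist.
ftb_configure_cmd() {
    if [ $# -lt 1 ]; then
        echo "usage: ftb_configure_cmd <ftb_prefix> [configure args]" >&2
        return 2
    fi
    prefix="$1"
    shift
    if [ ! -d "$prefix" ]; then
        echo "FTB prefix not found: $prefix" >&2
        return 1
    fi
    extra="$*"
    # Append the extra arguments only when some were given.
    echo "./configure --with-ftb=$prefix${extra:+ $extra}"
}
```

For example, `ftb_configure_cmd /opt/ --prefix="$HOME/openmpi"` prints a command that can be run from the top of the Open MPI source tree; the --prefix value here is just an illustrative install location.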
      

Running Open MPI with FTB backplane

  • Start the FTB database server and the FTB agent, following the instructions in the Launching FTB section of the wiki. The database server can be started on the head node, whereas the FTB agents are started on the individual nodes. To verify that the FTB backplane is running correctly, run the examples found in the components/examples directory of the FTB installation.
  • To relay Open MPI failure events or other fault-related information, run the MPI application with the Open MPI FTB notifier component enabled.
      shell$ mpirun -np <num_procs> --mca notifier ftb my-app
      
    More information about the runtime parameters specific to the FTB component can be found in the API section.
  • To test the Open MPI FTB notifier interface, run the orte_notifier_ftb test in the test/runtime/ directory of the Open MPI installation.
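
Before launching, it can help to confirm that the ftb notifier component was actually built into the Open MPI installation. The sketch below assumes that ompi_info lists each component on a line of the form "MCA notifier: <component> (...)"; the helper names has_ftb_notifier and ftb_mpirun_cmd are hypothetical. The launch command it prints is the same one shown in the steps above.

```shell
#!/bin/sh
# Sketch: check for the ftb notifier component, then build the launch command.
# has_ftb_notifier and ftb_mpirun_cmd are hypothetical helper names.

# Succeeds if the text on stdin (expected: the output of ompi_info) lists
# an ftb notifier component.
# Assumes one "MCA notifier: <component> (...)" line per component.
has_ftb_notifier() {
    grep -Eq 'MCA notifier:[[:space:]]*ftb([^[:alnum:]_]|$)'
}

# Usage: ftb_mpirun_cmd <num_procs> <app> [app args...]
# Prints the mpirun command line with the FTB notifier enabled.
ftb_mpirun_cmd() {
    if [ $# -lt 2 ]; then
        echo "usage: ftb_mpirun_cmd <num_procs> <app> [app args]" >&2
        return 2
    fi
    np="$1"
    shift
    echo "mpirun -np $np --mca notifier ftb $*"
}
```

A typical use would be `ompi_info | has_ftb_notifier && ftb_mpirun_cmd 4 my-app`, which prints the launch command only when the component is present.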