Profiling of OpenFOAM solvers

Profiling of OpenFOAM solvers

Profiling your HPC applications can be of great use if one wants to investigate the performance. This is normally not done as routine when using "off-the-shelf software". However, if you know that you are going to use a lot of resources and time on a specific code, it might be worth investigating what settings that give the best performance and "value for money" when it comes to using the assigned CPU hours in an efficient way.

This small guide will show the procedure of linking the IPM (integrated performance monitoring) profiler with OpenFOAM, however the procedure is identical if you want to link for example Darshan to OpenFOAM as well.

Using LD_PRELOAD (does not work at the moment)

The quick and dirty way to link IPM and OpenFOAM together is to set the LD_PRELOAD environment variable to the path to IPM's dynamic library. This tells the linker to link IPM to OpenFOAM at execution time, without the need for recompiling etc. LD_PRELOAD will in fact be set as soon as you load the IPM module in Vilje, so the only thing you need to do is to load the package in your job scrip right before or after you load the OpenFOAM package.

Note: This function is BROKEN in combination with the MPT-MPI versions currently installed on Vilje (as of aug. 2012), and setting LD_PRELOAD will cause MPT to crash with the following error message:

Terminal window

MPI: MPI_COMM_WORLD rank 104 has terminated without calling MPI_Finalize()
MPI: aborting job
MPI: Received signal 9

This is not an OpenFOAM-specific issue, and applies to all MPI applications in combination with MPT-MPI and LD_PRELOAD.

Recompiling OpenFOAM solver

Since the quick and dirty way described above does not work we have to find another way to tell the system's linker that OpenFOAM should link to IPM. This can be done by recompiling the solver we are interested in profiling.

The following guide is based on the guides found at the unofficial OpenFOAM Wiki and the OpenFOAM documentation. We are using the widely used pisoFoam solver as an example, but this guide should apply to all solvers and other applications (such as snappyHexMesh). We are going to use the most recent OpenFOAM version, that is 2.1.1 at the time of writing, but again, this should work with other versions as well.

Load OpenFOAM, MPT, IPM and the compiler from Intel:
Terminal window
module load openfoam module load mpt module load intelcomp module load ipm unset LD_PRELOAD
The LD_PRELOAD is unset to avoid the system linker to link all the applications we are running to IPM.

Copy the solver sources from the current location to a new location in your home directory:

Terminal window

mkdir -p $WM_PROJECT_USER_DIR/applications/solvers/incompressible
cp -r $FOAM_APP/solvers/incompressible/pisoFoam $WM_PROJECT_USER_DIR/applications/solvers/incompressible/pisoFoamIPM

We have here given the solver a new name, that is pisoFoamIPM to distinguish it from the non-IPM version.

You should rename pisoFoam.C to pisoFoamIPM.C, and the pisoFoam.dep can be deleted:Terminal windowmv pisoFoam.C pisoFoamIPM.Crm pisoFoam.dep
Enter the Make directory, and delete the linux64IccDPOpt directory:Terminal windowcd Makerm -r linux64IccDPOpt
Change the contents of the files-file to be:Make/filespisoFoamIPM.C EXE = $(FOAM_USER_APPBIN)/pisoFoamIPM
Now here comes the real deal: open the options file and specify that we shall link against IPM:Make/optionsEXE_INC = \-I$(LIB_SRC)/turbulenceModels/incompressible/turbulenceModel \-I$(LIB_SRC)/transportModels \-I$(LIB_SRC)/transportModels/incompressible/singlePhaseTransportModel \-I$(LIB_SRC)/finiteVolume/lnInclude EXE_LIBS = \-lincompressibleTurbulenceModel \-lincompressibleRASModels \-lincompressibleLESModels \-lincompressibleTransportModels \-lfiniteVolume \-lmeshTools \-L$(IPM_LIBPATH) \-lipmIf you want to link to something else than IPM, for example Darshan, you must specify the path to the Darshan shared library after the capital L (replacing $(IPM_LIBPATH)) and replace -lipm by -ldarshan.
Go one level down from the Make directory, so that you are in the pisoFoamIPM directory. We are now ready to compile our solver! This is done by a simple wmake command, and should be fairly quick: Terminal windowcd ..wmake
You can check if the solver has compiled properly by running pisoFoamIPM -help. This command should give some output. If it fails, something is wrong.

Now you must remember to also load IPM every time you are running pisoFoamIPM, and unset the LD_PRELOAD variable to avoid loading it twice (or crashing MPT as described above).

Using MPI_Pcontrol to analyze performance

If you are working on improving the performance of a certain piece of a solver, and you want to profile that part in special, that can be done by IPM. If we look in the small (but helpful) IPM user guide we see that we can insert the MPI_Pcontrol function with the parameters 1 (to indicate the start of a section) and -1 (to indicate the end of a section) to let IPM distinguish between the different code sections.

The only problem is that the OpenFOAM solvers do not link to MPI directly, since the MPI implementation is hidden inside the Pstream class. Therefore, we must edit the dependencies of our newly created solver. This is done by adding two lines in the top of the Make/options file:Make/options

sinclude $(GENERAL_RULES)/mplib$(WM_MPLIB)sinclude $(RULES)/mplib$(WM_MPLIB) EXE_INC = \-I$(LIB_SRC)/turbulenceModels/incompressible/turbulenceModel \-I$(LIB_SRC)/transportModels \-I$(LIB_SRC)/transportModels/incompressible/singlePhaseTransportModel \-I$(LIB_SRC)/finiteVolume/lnInclude EXE_LIBS = \-lincompressibleTurbulenceModel \-lincompressibleRASModels \-lincompressibleLESModels \-lincompressibleTransportModels \-lfiniteVolume \-lmeshTools \-L$(IPM_LIBPATH) \-lipm

Then the time is come to edit the solver source pisoFoamIPM.C itself. The first thing you need to do is to include #include "mpi.h" at the top of the file. Then you can insert MPI_Pcontrol calls as it suits you. This is the results of our editing (you might of course create different regions):pisoFoamIPM.C

#include "mpi.h"#include "fvCFD.H"#include "singlePhaseTransportModel.H"#include "turbulenceModel.H" // * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * // int main(int argc, char *argv[]){#include "setRootCase.H" MPI_Pcontrol(1, "startup");#include "createTime.H"#include "createMesh.H"#include "createFields.H"#include "initContinuityErrs.H"MPI_Pcontrol(-1, "startup"); // * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * // Info<< "\nStarting time loop\n" << endl; while (runTime.loop()){Info<< "Time = " << runTime.timeName() << nl << endl; #include "readPISOControls.H"#include "CourantNo.H" // Pressure-velocity PISO corrector{// Momentum predictorMPI_Pcontrol(1, "momentumPredictor");fvVectorMatrix UEqn(fvm::ddt(U)+ fvm::div(phi, U)+ turbulence->divDevReff(U)); UEqn.relax(); if (momentumPredictor){solve(UEqn == -fvc::grad(p));}MPI_Pcontrol(-1, "momentumPredictor"); // --- PISO loop for (int corr=0; corr<nCorr; corr++){MPI_Pcontrol(1, "fluxCalc");volScalarField rAU(1.0/UEqn.A()); U = rAU*UEqn.H();phi = (fvc::interpolate(U) & mesh.Sf())+ fvc::ddtPhiCorr(rAU, U, phi); adjustPhi(phi, U, p);MPI_Pcontrol(-1, "fluxCalc"); // Non-orthogonal pressure corrector loopMPI_Pcontrol(1, "pressureCorrector");for (int nonOrth=0; nonOrth<=nNonOrthCorr; nonOrth++){// Pressure corrector fvScalarMatrix pEqn(fvm::laplacian(rAU, p) == fvc::div(phi)); pEqn.setReference(pRefCell, pRefValue); if(corr == nCorr-1&& nonOrth == nNonOrthCorr){pEqn.solve(mesh.solver("pFinal"));}else{pEqn.solve();} if (nonOrth == nNonOrthCorr){phi -= pEqn.flux();}}MPI_Pcontrol(-1, "pressureCorrector"); MPI_Pcontrol(1, "contErrors");#include "continuityErrs.H"MPI_Pcontrol(-1, "contErrors"); MPI_Pcontrol(1, "momentumCorrector");U -= rAU*fvc::grad(p);U.correctBoundaryConditions();MPI_Pcontrol(-1, "momentumCorrector");}}MPI_Pcontrol(1, "turbulenceModel");turbulence->correct();MPI_Pcontrol(-1, "turbulenceModel"); runTime.write(); Info<< "ExecutionTime = " << runTime.elapsedCpuTime() << " s"<< " ClockTime = " << runTime.elapsedClockTime() << " s"<< nl << endl;} Info<< "End\n" << endl; return 0;}

After this is finished, re-compile the solver with the wmake command.

When you now run the pisoFoamIPM application, we will get information about how long time the solver spent inside the individual regions, i.e. how long time it used to solve the momentum equations, pressure correctors and execute the turbulence model. Communication and MPI statistics is also grouped and displayed on a per-region basis as well as general statistics for the entire solver. Note that there are pieces of code in our solver that is not part of any explicit region, and they are handled together as if they were a separate region (called ipm_noregion).