Profiling of OpenFOAM solvers
Profiling an HPC application is worthwhile whenever you want to understand its performance. It is rarely done as a matter of routine for off-the-shelf software. However, if you expect to spend a lot of time and resources on a specific code, it can pay off to find out which settings give the best performance, so that your allocated CPU hours are used efficiently.
This short guide shows how to link the IPM (Integrated Performance Monitoring) profiler with OpenFOAM. The procedure is the same for other profilers, for example Darshan.
Using LD_PRELOAD (does not work at the moment)
The quick and dirty way to combine IPM and OpenFOAM is to set the LD_PRELOAD environment variable to the path of IPM's dynamic library. This tells the dynamic loader to load IPM into OpenFOAM at execution time, with no need to recompile anything. On Vilje, LD_PRELOAD is in fact set as soon as you load the IPM module, so all you need to do is load the IPM module in your job script right before or after you load the OpenFOAM module.
Note: This approach is BROKEN in combination with the MPT-MPI versions currently installed on Vilje (as of August 2012). Setting LD_PRELOAD causes MPT to crash with the following error message:
    MPI: MPI_COMM_WORLD rank 104 has terminated without calling MPI_Finalize()
    MPI: aborting job
    MPI: Received signal 9
This is not an OpenFOAM-specific issue; it affects every MPI application that combines MPT-MPI with LD_PRELOAD.
Recompiling OpenFOAM solver
Since the quick and dirty way described above does not work, we need another way to tell the system's linker that OpenFOAM should link against IPM. This can be done by recompiling the solver we want to profile.
The following guide is based on the guides found at the unofficial OpenFOAM Wiki and the OpenFOAM documentation. We use the widely used pisoFoam solver as an example, but the procedure should apply to all solvers and other applications (such as snappyHexMesh). We use the most recent OpenFOAM version, which is 2.1.1 at the time of writing, but again, this should work with other versions as well.
- Load OpenFOAM, MPT, IPM and the Intel compiler:

      module load openfoam
      module load mpt
      module load intelcomp
      module load ipm
      unset LD_PRELOAD

  LD_PRELOAD is unset so that IPM is not preloaded into every application we run.

- Copy the solver sources from their current location to a new location in your home directory:

      mkdir -p $WM_PROJECT_USER_DIR/applications/solvers/incompressible
      cp -r $FOAM_APP/solvers/incompressible/pisoFoam $WM_PROJECT_USER_DIR/applications/solvers/incompressible/pisoFoamIPM

  The copy is named pisoFoamIPM to distinguish it from the non-IPM version.

- Inside the new pisoFoamIPM directory, rename pisoFoam.C to pisoFoamIPM.C. The file pisoFoam.dep can be deleted:

      mv pisoFoam.C pisoFoamIPM.C
      rm pisoFoam.dep
- Enter the Make directory and delete the linux64IccDPOpt directory:

      cd Make
      rm -r linux64IccDPOpt
- Change the contents of the files file (Make/files) to:

      pisoFoamIPM.C

      EXE = $(FOAM_USER_APPBIN)/pisoFoamIPM
- Now for the important part: open the options file (Make/options) and specify that we link against IPM:

      EXE_INC = \
          -I$(LIB_SRC)/turbulenceModels/incompressible/turbulenceModel \
          -I$(LIB_SRC)/transportModels \
          -I$(LIB_SRC)/transportModels/incompressible/singlePhaseTransportModel \
          -I$(LIB_SRC)/finiteVolume/lnInclude

      EXE_LIBS = \
          -lincompressibleTurbulenceModel \
          -lincompressibleRASModels \
          -lincompressibleLESModels \
          -lincompressibleTransportModels \
          -lfiniteVolume \
          -lmeshTools \
          -L$(IPM_LIBPATH) \
          -lipm

  If you want to link against something other than IPM, for example Darshan, give the directory containing the Darshan shared library after the capital -L (replacing $(IPM_LIBPATH)) and replace -lipm with -ldarshan.

- Go one level up from the Make directory, so that you are in the pisoFoamIPM directory. We are now ready to compile our solver. This is done with a simple wmake command and should be fairly quick:

      cd ..
      wmake
- You can check that the solver compiled properly by running pisoFoamIPM -help. This command should print a usage message. If it fails, something is wrong.
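Beyond running -help, it can be reassuring to confirm that the new binary really depends on IPM. The following is a minimal sketch, assuming IPM was linked as a shared library and that ldd is available where you run it. The helper name links_against is hypothetical, and the solver location is an assumption based on the EXE line in Make/files.

```shell
#!/bin/sh
# links_against BINARY PATTERN  (hypothetical helper, not part of OpenFOAM or IPM)
# Succeeds if ldd lists a shared library whose name contains PATTERN.
links_against() {
    ldd "$1" 2>/dev/null | grep -q -- "$2"
}

# Assumed location: Make/files installs the solver into $FOAM_USER_APPBIN.
solver="${FOAM_USER_APPBIN:-.}/pisoFoamIPM"

if links_against "$solver" libipm; then
    echo "$solver is linked against IPM"
else
    echo "$solver does not list libipm among its shared-library dependencies"
fi
```

If libipm is not listed even though wmake succeeded, the link step may have picked up a static archive, or the linker may have dropped the dependency. Both cases are worth checking before drawing conclusions from a profile.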
Remember to load IPM every time you run pisoFoamIPM, and to unset the LD_PRELOAD variable so that IPM is not loaded twice (and MPT does not crash as described above).
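Because this step is easy to forget, it can be scripted. The sketch below shows one possible way to do so in a job script. Both helper names are hypothetical, and the launcher (mpiexec_mpt) is an assumption about your system, not something prescribed by IPM or OpenFOAM.

```shell
#!/bin/sh
# preload_mentions_ipm  (hypothetical helper)
# Succeeds if LD_PRELOAD is set and contains the string "ipm".
preload_mentions_ipm() {
    case "${LD_PRELOAD:-}" in
        *ipm*) return 0 ;;
        *)     return 1 ;;
    esac
}

# run_profiled NPROCS  (hypothetical wrapper)
# Clears an IPM preload, then starts the recompiled solver in parallel.
# Assumes the openfoam, mpt, intelcomp and ipm modules are already loaded
# and that mpiexec_mpt is the MPI launcher on your system.
run_profiled() {
    if preload_mentions_ipm; then
        echo "LD_PRELOAD points at IPM; unsetting it before the run" >&2
        unset LD_PRELOAD
    fi
    mpiexec_mpt -n "$1" pisoFoamIPM -parallel
}
```

In a job script this would be called as run_profiled 32 after the module load lines, with 32 replaced by the number of MPI ranks of your decomposed case.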
Using MPI_Pcontrol to analyze performance
If you are working on improving the performance of a particular part of a solver and want to profile that part specifically, IPM can do that. The small (but helpful) IPM user guide shows that we can insert calls to the MPI_Pcontrol function with the arguments 1 (to mark the start of a section) and -1 (to mark the end of a section), so that IPM can distinguish between the different code sections.
The only problem is that the OpenFOAM solvers do not link to MPI directly, since the MPI implementation is hidden inside the Pstream class. We therefore have to edit the build dependencies of our newly created solver by adding two lines at the top of the Make/options file:

    sinclude $(GENERAL_RULES)/mplib$(WM_MPLIB)
    sinclude $(RULES)/mplib$(WM_MPLIB)

    EXE_INC = \
        -I$(LIB_SRC)/turbulenceModels/incompressible/turbulenceModel \
        -I$(LIB_SRC)/transportModels \
        -I$(LIB_SRC)/transportModels/incompressible/singlePhaseTransportModel \
        -I$(LIB_SRC)/finiteVolume/lnInclude

    EXE_LIBS = \
        -lincompressibleTurbulenceModel \
        -lincompressibleRASModels \
        -lincompressibleLESModels \
        -lincompressibleTransportModels \
        -lfiniteVolume \
        -lmeshTools \
        -L$(IPM_LIBPATH) \
        -lipm

These two rule files normally define the variables PFLAGS, PINC and PLIBS for the selected MPI library. If the compiler cannot find mpi.h, or the linker reports unresolved MPI symbols, try adding $(PFLAGS) $(PINC) to EXE_INC and $(PLIBS) to EXE_LIBS, which is how OpenFOAM's own Pstream library consumes them.
Now it is time to edit the solver source, pisoFoamIPM.C, itself. First, add #include "mpi.h" at the top of the file. Then insert MPI_Pcontrol calls wherever it suits you. This is the result of our editing of pisoFoamIPM.C (you may of course define different regions):
    #include "mpi.h"
    #include "fvCFD.H"
    #include "singlePhaseTransportModel.H"
    #include "turbulenceModel.H"

    // * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

    int main(int argc, char *argv[])
    {
        #include "setRootCase.H"

        MPI_Pcontrol(1, "startup");
        #include "createTime.H"
        #include "createMesh.H"
        #include "createFields.H"
        #include "initContinuityErrs.H"
        MPI_Pcontrol(-1, "startup");

        // * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

        Info<< "\nStarting time loop\n" << endl;

        while (runTime.loop())
        {
            Info<< "Time = " << runTime.timeName() << nl << endl;

            #include "readPISOControls.H"
            #include "CourantNo.H"

            // Pressure-velocity PISO corrector
            {
                // Momentum predictor
                MPI_Pcontrol(1, "momentumPredictor");
                fvVectorMatrix UEqn
                (
                    fvm::ddt(U)
                  + fvm::div(phi, U)
                  + turbulence->divDevReff(U)
                );

                UEqn.relax();

                if (momentumPredictor)
                {
                    solve(UEqn == -fvc::grad(p));
                }
                MPI_Pcontrol(-1, "momentumPredictor");

                // --- PISO loop
                for (int corr=0; corr<nCorr; corr++)
                {
                    MPI_Pcontrol(1, "fluxCalc");
                    volScalarField rAU(1.0/UEqn.A());

                    U = rAU*UEqn.H();
                    phi = (fvc::interpolate(U) & mesh.Sf())
                        + fvc::ddtPhiCorr(rAU, U, phi);

                    adjustPhi(phi, U, p);
                    MPI_Pcontrol(-1, "fluxCalc");

                    // Non-orthogonal pressure corrector loop
                    MPI_Pcontrol(1, "pressureCorrector");
                    for (int nonOrth=0; nonOrth<=nNonOrthCorr; nonOrth++)
                    {
                        // Pressure corrector
                        fvScalarMatrix pEqn
                        (
                            fvm::laplacian(rAU, p) == fvc::div(phi)
                        );

                        pEqn.setReference(pRefCell, pRefValue);

                        if
                        (
                            corr == nCorr-1
                         && nonOrth == nNonOrthCorr
                        )
                        {
                            pEqn.solve(mesh.solver("pFinal"));
                        }
                        else
                        {
                            pEqn.solve();
                        }

                        if (nonOrth == nNonOrthCorr)
                        {
                            phi -= pEqn.flux();
                        }
                    }
                    MPI_Pcontrol(-1, "pressureCorrector");

                    MPI_Pcontrol(1, "contErrors");
                    #include "continuityErrs.H"
                    MPI_Pcontrol(-1, "contErrors");

                    MPI_Pcontrol(1, "momentumCorrector");
                    U -= rAU*fvc::grad(p);
                    U.correctBoundaryConditions();
                    MPI_Pcontrol(-1, "momentumCorrector");
                }
            }

            MPI_Pcontrol(1, "turbulenceModel");
            turbulence->correct();
            MPI_Pcontrol(-1, "turbulenceModel");

            runTime.write();

            Info<< "ExecutionTime = " << runTime.elapsedCpuTime() << " s"
                << "  ClockTime = " << runTime.elapsedClockTime() << " s"
                << nl << endl;
        }

        Info<< "End\n" << endl;

        return 0;
    }
When this is done, recompile the solver with the wmake command.
When you now run the pisoFoamIPM application, you get information about how much time the solver spent inside the individual regions, i.e. how long it took to solve the momentum equations, run the pressure correctors and execute the turbulence model. Communication and MPI statistics are also grouped and displayed per region, alongside general statistics for the entire solver. Note that some parts of the code in our solver are not inside any explicit region; IPM handles them together as if they were a separate region (called ipm_noregion).