Debugging

Compiler Debug Options

The table below lists debugging options for the Intel compilers.

Option          Description
-O0             Disables all optimizations. The default optimization level is -O2.
-g              Generates symbolic debugging information in the object file.
-traceback      Generates extra information in the object file so that source file traceback information is available when a severe error occurs at run time.
-check bounds   Enables compile-time and run-time checking of array subscript and character substring expressions (Fortran only).
-check uninit   Enables run-time checking for uninitialized local scalar variables without the SAVE attribute (Fortran only).
-check-uninit   Enables run-time checking for uninitialized local scalar variables (C/C++ only).
-fpe0           Enables trapping of floating-point invalid, divide-by-zero, and overflow exceptions (Fortran only).
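
As a rough illustration, the Fortran options from the table above might be combined into a single debug build as sketched below. The compiler driver name ifort, the source file prog.f90 and the helper debug_compile_cmd are assumptions made for this example. The script prints the command line and only invokes the compiler if it is actually found.

```shell
#!/bin/sh
# Debug options from the table above (Fortran variants of the checks).
DEBUG_FLAGS="-O0 -g -traceback -check bounds -check uninit -fpe0"

# Assumed names: ifort is the compiler driver, prog.f90 the source file.
FC=ifort

# Build the compile command line for a given source file (hypothetical helper).
debug_compile_cmd() {
    src=$1
    echo "$FC $DEBUG_FLAGS -o ${src%.*} $src"
}

# Show the command, and run it only if the compiler and source are present.
debug_compile_cmd prog.f90
if command -v "$FC" >/dev/null 2>&1 && [ -f prog.f90 ]; then
    # DEBUG_FLAGS is deliberately unquoted so that it splits into options.
    $FC $DEBUG_FLAGS -o prog prog.f90
fi
```

Run-time checks slow the program down considerably, so a build like this is normally kept separate from the optimized production build.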

GNU GDB

GDB, the GNU Project debugger, is a free software debugger that supports several programming languages, including C, C++ and Fortran. GDB has a command-line interface and does not include its own graphical user interface (GUI).

GDB Commands

To begin a debug session, compile the code with the -g option to add debugging information, then start GDB by running the gdb command with the executable program as its argument:

$ gdb prog

Once inside the GDB environment, indicated by the (gdb) prompt, you can issue commands. The following shows a list of selected GDB commands:

  • help: display a list of named classes of commands
  • run: start the program
  • attach: attach to a running process outside GDB
  • step: go to the next source line; will step into a function/subroutine
  • next: go to the next source line; function/subroutine calls are executed without stepping into them
  • continue: continue executing
  • break: set a breakpoint
  • watch: set a watchpoint to stop execution when the value of a variable or an expression changes
  • list: display (default 10) lines of source surrounding the current line
  • print: print the value of a variable
  • backtrace: display a stack frame for each active subroutine
  • detach: detach from a process
  • quit: exit GDB

Commands can be abbreviated to their first letter or first few letters if the abbreviation is unambiguous. In some cases a single letter is specifically defined for a command even if it would otherwise be ambiguous. For example, to start a program:

(gdb) r
Starting program: /path/to/executable/prog

To execute a shell command during the debugging session, prefix it with shell, e.g.

(gdb) shell ls -l
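
The same commands can also be passed to GDB on the command line, which may be handy for getting a quick backtrace from a crashing program without an interactive session. The sketch below assumes an executable ./prog compiled with -g. The helper gdb_batch_cmd is a hypothetical name that only prints the command line, and GDB itself is invoked only if it is installed and the program exists.

```shell
#!/bin/sh
# Assumed name: ./prog is an executable compiled with -g.
PROG=./prog

# Print the non-interactive GDB command line for a program (hypothetical helper).
gdb_batch_cmd() {
    # -batch: exit after processing all commands
    # -ex:    execute a single GDB command (here: run, then backtrace)
    # --args: treat the remaining words as the program and its arguments
    echo "gdb -batch -ex run -ex backtrace --args $*"
}

gdb_batch_cmd "$PROG"
if command -v gdb >/dev/null 2>&1 && [ -x "$PROG" ]; then
    gdb -batch -ex run -ex backtrace --args "$PROG"
fi
```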

Attaching to running processes

GDB can attach to already running processes using the attach [process-id] command. After attaching to a process GDB will stop it from running. This allows you to prepare the debug session using GDB commands, e.g. setting breakpoints or watchpoints. Then use the continue command to let the process continue running.

Although GDB is a serial debugger you can examine parallel programs by attaching to individual processes of the program. For instance, when running batch jobs you can log into one of the compute nodes of the job and attach to one of the running processes.

The listing below displays a sample debug session attaching to one of the processes of a running MPI job for examining data (lines starting with # are comments):

$ gdb

(gdb) # List the processes of the MPI program
(gdb) shell ps -eo pid,comm | grep mpi_prog
14957   mpi_prog
14961   mpi_prog
14962   mpi_prog
...etc.

(gdb) # Attach to one of the MPI processes
(gdb) attach 14961
Attaching to process 14961
Reading symbols from /path/to/executable/mpi_prog...done.
...etc

(gdb) # Set a watchpoint to stop execution when the variable Uc is updated
(gdb) watch Uc
Hardware watchpoint 1: Uc

(gdb) # Continue the execution of the program
(gdb) continue
Continuing.

Hardware watchpoint 1: Uc
Old value = -3.33545399
New value = -2.11184907
POTTEMP::ptemp (ldiad=...etc) at ptemp1.f90:298
298              Vc= dsdx(2,1,ie2)*u0 + dsdx(2,2,ie2)*v0 + dsdx(2,3,ie2)*w0

(gdb) # Set the list command to display 16 lines...
(gdb) set listsize 16
(gdb) # ...and display the source backwards starting 2 lines below the current one
(gdb) list +2
284              do k= 1, 8
285                kp= lnode2(k,ie2)   
286                u0= u0 + u12(kp)
287                v0= v0 + u22(kp)
288                w0= w0 + u32(kp)
289                vt= vt + vtef2(kp)
290              enddo
291
292              u0= 0.125*u0;  v0= 0.125*v0;  w0= 0.125*w0;  vt= 0.125*vt
293
294     !
295     !----    Contravariant velocity  
296     !
297              Uc= dsdx(1,1,ie2)*u0 + dsdx(1,2,ie2)*v0 + dsdx(1,3,ie2)*w0
298              Vc= dsdx(2,1,ie2)*u0 + dsdx(2,2,ie2)*v0 + dsdx(2,3,ie2)*w0
299              Wc= dsdx(3,1,ie2)*u0 + dsdx(3,2,ie2)*v0 + dsdx(3,3,ie2)*w0

(gdb) # Print a 5 element slice of the variable u12
(gdb) print u12(3006:3010)
$1 = (0.0186802763, 0.0188683271, 0.0145201795, 0.00553302653, -0.00918145757)

(gdb) # Release the process from GDB control
(gdb) detach
Detaching from program: /path/to/executable/mpi_prog, process 14961

(gdb) quit
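
When a parallel job appears to hang, it can be useful to collect a backtrace from every process of the program on a node instead of attaching to them one by one. The following is a minimal sketch of that idea, assuming the executable is named mpi_prog as in the session above. The helper pids_of is a hypothetical name, and the loop does nothing if no such process is running on the node.

```shell
#!/bin/sh
# Assumed name: mpi_prog is the executable name of the running MPI job.
NAME=mpi_prog

# Print the PIDs of all processes whose name is exactly $1, one per line.
pids_of() {
    pgrep -x "$1" 2>/dev/null || true
}

for pid in $(pids_of "$NAME"); do
    echo "== backtrace of process $pid =="
    # -p attaches to the process; -batch makes GDB run the -ex commands,
    # then detach and exit, so the process keeps running afterwards.
    gdb -batch -p "$pid" -ex backtrace
done
```

Only processes on the node where the script runs are found, and attaching normally works only for processes you own.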

Examining Core Files

Core files can be examined by specifying both the executable program and the core file:

$ gdb prog core
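
A core file is typically written only if the core file size limit of the shell that starts the program is nonzero, and on many systems the default limit is 0. The sketch below shows one way to check and raise the limit before running the program. The file names ./prog and core are assumptions (some systems name core files differently, for example with the process ID appended).

```shell
#!/bin/sh
# Show the current soft limit on core file size (0 means no core files).
ulimit -c

# Try to raise the soft limit; this fails if the hard limit is lower.
ulimit -S -c unlimited 2>/dev/null || echo "core size limit could not be raised"

# Assumed names: ./prog is the executable, core the core file it produced.
if command -v gdb >/dev/null 2>&1 && [ -f core ] && [ -x ./prog ]; then
    # Print the call stack at the time of the crash and exit.
    gdb -batch -ex backtrace ./prog core
fi
```

In a batch job the ulimit command has to be placed in the job script, since the limit applies to the shell that launches the program.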

One can also produce a core file from within the GDB session to preserve a snapshot of a program’s state using the command:

(gdb) generate-core-file

TotalView

TotalView is a GUI-based source code debugger from Rogue Wave Software, Inc. It allows for debugging of serial and parallel codes. Program execution is controlled by stepping line by line through the code, setting breakpoints, or setting watchpoints on variables. It is also effective for debugging memory errors and leaks and for diagnosing problems such as deadlocks.

TotalView comes with the ReplayEngine tool embedded. When debugging a program crash, ReplayEngine lets you start from the point of failure and work backward in time to find the error that caused it. Note that ReplayEngine increases the amount of memory your program uses, as it keeps history and state information in memory.

TotalView works with C, C++ and Fortran applications, and supports OpenMP and several MPI implementations including SGI MPT, MVAPICH2 and Open MPI.

Starting TotalView

After compiling your MPI code with the -g flag, start TotalView with your executable, e.g. mpi_prog, by issuing the command:

$ totalview mpi_prog

Three windows will appear: the TotalView Root Window, the Startup Parameters Dialog Box and the Process Window. The Startup Parameters Dialog Box allows you to specify arguments (command line and environment variables), standard I/O files, the MPI implementation and number of MPI tasks, and whether to enable ReplayEngine and memory debugging.

Figure 1. TotalView Startup Parameters Dialog Box

Figure 2. TotalView Process Window

Choose MPT from the list of Parallel system implementations, and enter the number of tasks for the debug session. Click OK to allow TotalView to load your program into the Process Window.

You are now ready to start debugging. Typical actions include:

  • Click the Step or Next buttons to go through the code statement by statement. For function calls, Step goes into the function while Next executes the function without stepping into it.
  • Create a breakpoint by clicking the line number displayed to the left in the Process Window. Click the Go button to run to this line.
  • Monitor a variable’s value by creating a watchpoint: select Action Points → Create Watchpoint. A watchpoint stops execution when a variable’s data changes.
  • Examine variables. Dive into a variable by clicking View → Lookup Variable or by double-clicking the variable with your left mouse button. The Variable Window appears.
  • Visualize variables across processes by diving into a variable and clicking View → Show Across → Processes in the Variable Window.
  • Examine array data. Dive into an array variable. Display array subsections by editing the Slice field in the array’s Variable Window. Show statistics about the array (or a slice of it) by clicking Tools → Statistics in the Variable Window.

Figure 3. Examining Data

Examining Core Files

If a program encounters a serious error and a process dumps a core file, you can examine this file by starting TotalView with:

$ totalview mpi_prog core

The Process Window displays the core file, with the Stack Trace, Stack Frame, and Source Panes showing the state of the process when it dumped core. The title bar of the Process Window names the signal that caused the core dump. The state of all variables at the time the error occurred can be examined.

Memory Debugging

TotalView has memory debugging facilities that work with objects created in heap memory, but not with objects created on the program stack. Watch Rogue Wave’s memory debugging videos to learn how to use them.

Interactive Batch System Debugging

To run TotalView in the batch system, first start an interactive batch job session, e.g.

$ qsub -I -X -A <my account> -l select=2:ncpus=32:mpiprocs=16 -l walltime=02:00:00

Load the appropriate compiler, MPT and TotalView modules, e.g.:

$ module load intelcomp/13.0.1 mpt/2.06 totalview/8.12.0-1

Then start TotalView by adding the -tv option to mpiexec_mpt:

$ mpiexec_mpt -tv ./a.out

Your program will now execute within TotalView on the number of nodes specified when submitting the job.

DDT

Allinea’s DDT debugger is also available. This is, however, a local offer, so if you prefer to learn a debugger that is also available at other Notur sites, you should choose the TotalView debugger instead.

DDT is a scalable, graphical debugging tool for scalar, multi-threaded and large-scale MPI applications. DDT includes a memory debugging feature that can detect some errors before they cause a program crash, and it can show the current state of communication between the processes of a program.

DDT supports C, C++, Fortran, OpenMP, CUDA, UPC, Fortran 2008 Coarrays and many MPI implementations, including SGI MPT.

Starting DDT

After compiling your program with -g, start DDT by issuing a command like this:

$ module load intelcomp/16.0.1 mpt/2.13 python/2.7.11 forge/6.0.3
$ ddt mpi_prog

Two windows will appear: DDT’s Main Window and the Run Dialog Box. The Run Dialog Box allows you to specify program arguments, input file, working directory, MPI implementation, number of processes, OpenMP support, CUDA support, support for memory debugging and environment variables. Confirm that SGI MPT is selected as the MPI implementation, and enter the number of processes for the debug session. Press the Run button to start debugging your program.

Figure 4. DDT Startup Window

Figure 5. DDT Main Window

DDT will show your code in the Main Window. The most used buttons on the toolbar with shortcut keys, from left to right, are:

  • Play/Continue (F9): Start/continue your program.
  • Pause (F10): Pause your program.
  • Add Breakpoint: Press this button and enter the line number. Note that breakpoints can also be added by double-clicking a line.
  • Step Into (F5), Step Over (F8), Step Out (F6): Step into, over, or out of function calls.
  • Run To Line: Press the button and enter the line number you want to run to.
  • Down Stack Frame (Ctrl-D), Up Stack Frame (Ctrl-U), Bottom Stack Frame (Ctrl-B): After your program has stopped inside a function, you can use these buttons to move up and down in the program’s call stack.

The bottom panel has a number of pages that can be selected with the tabs at the top of the panel. There, you can:

  • Watch the output that the program prints to standard output.
  • Watch the breakpoints that you have set.
  • Monitor the values of variables by creating watchpoints. A watchpoint stops execution when a variable’s data changes.
  • Examine the current call stack. Clicking on a line in the call stack will make DDT show the corresponding program code. The number to the left of each line is the number of processes that are in that place in the code.
  • Examine and set tracepoints. Tracepoints allow you to see which lines of code your program is executing without stopping it. DDT prints the file and line number when a tracepoint is reached. You can also capture the value of any variable at that point.

In the right window panel one can examine:

  • All the local variables in the current function/subroutine.
  • The variables and their current values that are used in the current line(s). A group of lines can be selected.
  • The current call stack with the value of each argument for each function call.

If you right-click a variable in the code, a menu appears showing the type of the variable and, for scalars, also its value. For arrays, select the View Array menu option. A window will appear showing the content of the array or, if a range is selected, a slice of the array.

Figure 6. Examining Data

Examining Core Files

DDT allows you to examine core files generated by a program. Start DDT without arguments and select Open Core Files on the Welcome Screen. This allows for selecting an executable and a core file to debug.
