Vampir

Introduction

Performance optimization is a key issue for the development of efficient parallel software applications. Vampir provides a manageable framework for analysis, which enables developers to quickly display program behavior at any level of detail. Detailed performance data obtained from a parallel program execution can be analyzed with a collection of different performance views. Intuitive navigation and zooming are the key features of the tool, which help to quickly identify inefficient or faulty parts of a program code. Vampir implements optimized event analysis algorithms and customizable displays which enable a fast and interactive rendering of very complex performance monitoring data. Ultra-large data volumes can be analyzed with a parallel version of Vampir, which is available on request. Vampir has a product history of more than 15 years and is well established on Unix-based HPC systems. This tool experience is now available for HPC systems that are based on Microsoft Windows HPC Server 2008. This new Windows edition of Vampir combines modern scalable event processing techniques with a fully redesigned graphical user interface.

Vampir on FutureGrid

VampirServer is currently available on India at /N/soft/x86_64/el5/india/vampirserver. The VampirTrace modules are installed on Alamo, Hotel, India, and Sierra. To load, type 'module load vampirtrace'.

Event-based Performance Tracing and Profiling

In software analysis, the term profiling refers to the creation of tables that summarize the runtime behavior of programs by means of accumulated performance measurements. Its simplest variant lists all program functions together with their number of invocations and the time they consumed. This type of profiling is also called inclusive profiling, as the time spent in subroutines is included in the statistics computation.

A commonly applied method for analyzing details of parallel program runs is to record so-called trace log files during runtime. The data collection process itself is also referred to as tracing a program. Unlike profiling, the tracing approach records timed application events like function calls and message communication as a combination of timestamp, event type, and event-specific data. This creates a stream of events, which allows very detailed observations of parallel programs. With this technology, synchronization and communication patterns of parallel program runs can be traced and analyzed in terms of performance and correctness. The analysis is usually carried out in a postmortem step, i.e., after completion of the program. Program traces can also be used to calculate the profiles mentioned above. Computing profiles from trace data allows arbitrary time intervals and process groups to be specified, in contrast to fixed profiles accumulated during runtime.
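As a rough illustration of how a profile can be computed from trace data for an arbitrary time interval, the following Python sketch accumulates invocation counts and inclusive and exclusive times from the enter/leave events of a single process. The event tuple layout and the function name profile are assumptions made for this example; they are not part of VampirTrace, OTF, or Vampir.

```python
from collections import defaultdict

def profile(events, t_start, t_end):
    """Accumulate per-function statistics for the window [t_start, t_end].

    events: time-ordered (timestamp, kind, function) tuples of one process,
            where kind is "enter" or "leave" (hypothetical layout). The
            events are assumed to be properly nested.
    Returns {function: {"calls", "inclusive", "exclusive"}}.
    """
    stats = defaultdict(lambda: {"calls": 0, "inclusive": 0.0, "exclusive": 0.0})
    stack = []  # open calls: [function, enter_time, time_spent_in_callees]

    def clip(t):
        # Restrict a timestamp to the selected window.
        return min(max(t, t_start), t_end)

    for timestamp, kind, function in events:
        if kind == "enter":
            stack.append([function, timestamp, 0.0])
            # Count only invocations that start inside the window.
            if t_start <= timestamp <= t_end:
                stats[function]["calls"] += 1
        elif kind == "leave":
            name, entered, child_time = stack.pop()
            inclusive = clip(timestamp) - clip(entered)
            stats[name]["inclusive"] += inclusive
            stats[name]["exclusive"] += inclusive - child_time
            if stack:
                # The caller's exclusive time excludes this call.
                stack[-1][2] += inclusive
    return dict(stats)

# Hypothetical event stream of a single process.
events = [
    (0, "enter", "main"),
    (2, "enter", "MPI_Send"),
    (5, "leave", "MPI_Send"),
    (10, "leave", "main"),
]

print(profile(events, 0, 10))  # profile of the whole run
print(profile(events, 4, 10))  # profile of a later sub-interval
```

Because the statistics are derived from the event stream, the same events can be re-evaluated for any other interval, which is not possible with a profile accumulated once at runtime.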

The Open Trace Format (OTF)

The Open Trace Format (OTF) was designed as a well-defined trace format with open, public domain libraries for writing and reading. This open specification of the trace information enables analysis and visualization tools like Vampir to operate efficiently at large scale. The format addresses large applications written in an arbitrary combination of Fortran77, Fortran (90/95/etc.), C, and C++.

Representation of Streams by Multiple Files

OTF uses a special ASCII data representation that encodes its data items as numbers and tokens in hexadecimal code without special prefixes. This makes the format efficient with respect to storage size, human readability, and search capabilities on timed event records. In order to support fast and selective access to large amounts of performance trace data, OTF is based on a stream model, i.e., separate units that each represent a segment of the overall data. An OTF stream may contain multiple independent processes, whereas each process belongs to exactly one stream. As shown in the figure, each stream is represented by multiple files, which store definition records, performance events, status information, and event summaries separately. A single global master file holds the necessary information for the process-to-stream mappings. Each file name starts with an arbitrary common prefix defined by the user:

{name}.otf: The master file.
{name}.0.def: The global definition file.
{name}.x.events: Events.
{name}.x.defs: Local definitions (optional).
{name}.x.snaps: Snapshots (optional).
{name}.x.stats: Statistics (optional).

Note: Open the master file (*.otf) to load a trace. When copying, moving, or deleting traces, it is important to include all associated files; otherwise, Vampir will consider the whole trace invalid. A good practice is to keep all files belonging to one trace in a dedicated directory. Detailed information about the Open Trace Format can be found in the Open Trace Format (OTF) documentation.
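When a trace has to be copied or moved, it can help to collect its files by name first. The following Python sketch selects, from a list of file names, those that follow the naming scheme described above for a given prefix. The function name select_trace_files is hypothetical, and two details are assumptions of this sketch rather than statements from this manual: stream identifiers in file names are matched as hexadecimal digits, and an optional trailing ".z" is accepted for compressed stream files.

```python
import re

# File kinds named in the text above.
_KINDS = ("def", "defs", "events", "snaps", "stats")

def select_trace_files(prefix, filenames):
    """Return the sorted names in `filenames` that belong to trace `prefix`."""
    pattern = re.compile(
        r"^%s(\.otf|\.[0-9a-fA-F]+\.(%s)(\.z)?)$"
        % (re.escape(prefix), "|".join(_KINDS))
    )
    return sorted(name for name in filenames if pattern.match(name))

# Hypothetical directory listing.
listing = [
    "hello.otf", "hello.0.def", "hello.1.events", "hello.1.stats",
    "hello.a.events.z", "hello.c", "hello2.otf", "other.1.events",
]
print(select_trace_files("hello", listing))
```

The result can be passed to a copy or archive step so that the master file and all stream files stay together.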

Getting Started

Generation of Trace Data

The generation of trace files for the Vampir performance visualization tool requires a working monitoring system to be attached to your parallel program. Unlike Windows HPC Server 2008, where the performance monitor is integrated into the operating system, Linux requires a separate performance monitor to record performance data. We recommend our VampirTrace monitoring facility, which is available as open source software. During a run of an application, VampirTrace generates an OTF trace file, which can be analyzed and visualized by Vampir. The VampirTrace library allows MPI communication events of a parallel program to be recorded in a trace file. Additionally, certain program-specific events can be included. To record MPI communication events, simply relink the program with the VampirTrace library. Recompiling the program source code is only necessary if program-specific events are to be added. Detailed information on the installation and usage of VampirTrace can be found in the VampirTrace documentation.

Enabling Performance Tracing

To perform measurements with VampirTrace, the application program needs to be instrumented. VampirTrace handles this automatically by default, while manual instrumentation is also possible. All the necessary instrumentation of user functions, MPI, and OpenMP events is handled by the compiler wrappers of VampirTrace (vtcc, vtcxx, vtf77, vtf90). All compile and link commands in your makefile should be replaced by the corresponding VampirTrace compiler wrapper, which performs the necessary instrumentation of the program and links the suitable VampirTrace library. Automatic instrumentation is the most convenient way to instrument your program. To use it, simply call the compiler wrappers without any additional parameters, e.g.:

vtf90 hello.f90 -o hello

For manual instrumentation with the VampirTrace API, simply include:

vt_user.inc (Fortran)

vt_user.h (C, C++)

and label any user defined sequence of statements for instrumentation as follows:

VT_USER_START(name) ... VT_USER_END(name)

in Fortran and C, and in C++ as follows:

VT_TRACER("name");

Afterwards, use

vtcc -DVTRACE hello.c -o hello

to combine the manual instrumentation with automatic compiler instrumentation or

vtcc -vt:inst manual -DVTRACE hello.c -o hello

to suppress additional compiler instrumentation.

Tracing an Application

Running a VampirTrace-instrumented application should normally result in an OTF trace file in the current working directory where the application was executed. On Linux, Mac OS, and Sun Solaris, the default name of the trace file is equal to the application name. On other systems, the default name is a.otf, but it can be defined manually by setting the environment variable VT_FILE_PREFIX to the desired name. After a run of an instrumented application, the traces of the individual processes need to be unified in terms of timestamps and event IDs. In most cases, this happens automatically. If it is necessary to unify local traces manually, use the following command:

vtunify <nproc>

If VampirTrace was built with support for OpenMP and/or MPI, the unification of local traces can be sped up significantly. To distribute the unification across multiple processes, the MPI-parallel version vtunify-mpi can be used as follows:

mpirun -np <nranks> vtunify-mpi <nproc>
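For scripted runs, the trace name can be set through the environment before the instrumented program is started. The following Python sketch sets VT_FILE_PREFIX for a child process and afterwards checks whether a master file with that prefix exists in the working directory. The helper names trace_environment and run_traced are hypothetical, and treating a missing master file as a hint that the local traces were not unified is an assumption of this sketch.

```python
import os
import subprocess

def trace_environment(prefix, base_env=None):
    """Return a copy of the environment with VT_FILE_PREFIX set to `prefix`."""
    env = dict(os.environ if base_env is None else base_env)
    env["VT_FILE_PREFIX"] = prefix
    return env

def run_traced(command, prefix, workdir="."):
    """Run `command` in `workdir` and report whether <prefix>.otf exists afterwards."""
    subprocess.run(command, cwd=workdir, env=trace_environment(prefix), check=True)
    return os.path.exists(os.path.join(workdir, prefix + ".otf"))
```

A call such as run_traced(["./hello"], "hello_run1") would start the instrumented program from the compile example above. If it returns False, the local traces may still have to be unified manually as described above.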

Starting Vampir and Loading a Trace File

To open a trace file, select "Open..." from the "File" menu. This opens the file-open dialog depicted below. The files shown in the list can be filtered with the file type selector. The default, "OTF Trace Files (*.otf)", shows only files that can be processed by the tool. All files can be displayed by selecting "All Files (*)". Alternatively, on Windows, a command-line invocation is possible:

"C:\Program Files\Vampir\Vampir.exe" [trace file]

To open multiple trace files at once, pass them one after another as command-line arguments:

"C:\Program Files\Vampir\Vampir.exe" [file 1] ... [file n]

It is also possible to start the application by double-clicking on an *.otf file (if Vampir was associated with *.otf files during the installation process). The trace files to be loaded have to be compliant with the Open Trace Format (OTF) standard. Windows HPC Server 2008 ships with the translator program etl2otf.exe, which produces appropriate input files.

Loading a Trace Log File in Vampir

While Vampir is loading the trace file, an empty "Trace View" window with a progress bar at the bottom opens. After Vampir has loaded the trace data completely, a default set of charts appears. The loading process can be interrupted at any time by clicking the cancel button in the lower right corner. Because events in the trace file are traversed one after another, the GUI still opens after a cancellation but shows only the earliest part of the trace file. For huge trace files with performance problems assumed to be at the beginning, this is a suitable strategy to save time.

Progress Bar and Cancel Loading Button

Basic functionality and navigation elements are described in Basics. The available charts and the information provided by them are explained in Performance Data Visualization.

Basics

After loading has been completed, the Trace View window title displays the trace file's name. By default, the Charts toolbar and the Zoom Toolbar are available.

Trace View Window with Charts Toolbar (A) and Zoom Toolbar (B)

Furthermore, the default set of charts is opened automatically after loading has been finished. The charts can be divided into three groups: timeline, statistical, and informational charts. Timeline charts show detailed event-based information for arbitrary time intervals, while statistical charts reveal accumulated measures computed from the corresponding event data. Informational charts provide additional or explanatory information regarding timeline and statistical charts. All available charts can be opened with the Charts toolbar (explained in The Charts Toolbar). In the following section, we will explain the basic functions of the Vampir GUI which are generic to all charts.

Chart Arrangement

The utility of charts can be increased by correlating them and the information they provide. Vampir supports this mode of operation by allowing you to display multiple charts at the same time. Charts that display a sequence of events, such as the Master Timeline and the Process Timeline, are aligned vertically. This alignment ensures that the temporal relationship of events is preserved across chart boundaries. You can arrange the charts according to your preferences by dragging them into the desired position. When the left mouse button is pressed while the mouse pointer is located above a placement decoration, the layout engine gives visual cues as to where the chart may be moved. As soon as the left mouse button is released, the chart arrangement changes accordingly. The entire procedure is depicted in the figures below. The flexible display architecture also allows increasing or decreasing the screen space used by a chart. Charts of particular interest may be given more space in order to render information in more detail.

Moving and Arranging Charts in the Trace View Window

A Custom Chart Arrangement in the Trace View Window

Closing (right) and Undocking (left) a Chart

The Trace View window can host an arbitrary number of charts. Charts can be added by clicking on the respective Charts toolbar icon or the corresponding Chart menu entry. With a few more clicks, charts can be combined into a custom chart arrangement. Customized layouts can be saved as described in Saving Policy. Every chart can be undocked or closed by clicking the dedicated icon in its upper right corner. Undocking a chart removes it from the current arrangement and presents it in its own window.

Undocking of a Chart

Docking a Chart

Labels (e.g., those showing names or values of functions) often need more space to show their whole text, so a further form of resizing is available. To read labels completely, you can change how the space in a chart is divided between the labels and the graphical representation. When the mouse pointer hovers over the blank space between labels and graphical representation, a movable separator appears. Clicking the separator decoration and moving the mouse while holding the left mouse button resizes the two areas.

Resizing Labels: (A) Hover over a Separator Decoration; (B) Drag and Drop the Separator

Context Menus

All chart displays have their own context menus with common entries as well as display-specific ones. In the following section, only the most common entries are discussed. A context menu can be accessed by right-clicking in the display window. Common entries are:

Reset Zoom: Go back to the initial state in horizontal zooming.
Reset Vertical Zoom: Go back to the initial state in vertical zooming.
Set Metric: Change the values represented in the chart, e.g., from Exclusive Time to Inclusive Time.
Sort By: Rearrange values or bars by a certain characteristic.

Zooming

Zooming is a key feature of Vampir. In most charts it is possible to zoom in and out to get abstract and detailed views of the visualized data. In the timeline charts, zooming produces a more detailed view of a specific time interval and therefore reveals information that could not be seen in the larger section. Short function calls in the Master Timeline may not be visible unless an appropriate zoom level has been reached. If the execution time of such a function is too short relative to the pixel resolution of your display, a shorter time interval has to be selected.

Note: Other charts can be affected when zooming in timeline displays. The interval chosen in a timeline chart such as the Master Timeline or Process Timeline also defines the time interval for the calculation of accumulated measurements in the statistical charts. Statistical charts like the Function Summary provide zooming of statistic values; in these cases, zooming does not affect any other chart. Zooming is disabled in the Pie Chart mode of the Function Summary, which is reachable via the context menu under Set Chart Mode->Pie Chart.
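The relation between execution time and pixel resolution mentioned above can be estimated with simple arithmetic. The following Python sketch computes how much time one pixel column covers and whether a call of a given duration spans at least one pixel. The one-pixel threshold and all function names are simplifying assumptions for illustration; they do not describe Vampir's actual drawing rules.

```python
def time_per_pixel(t_start, t_end, width_px):
    """Time covered by one pixel column of a timeline that is width_px wide."""
    return (t_end - t_start) / width_px

def is_resolvable(duration, t_start, t_end, width_px):
    """True if a call of length `duration` spans at least one pixel column."""
    return duration >= time_per_pixel(t_start, t_end, width_px)

def longest_interval_for(duration, width_px):
    """Longest visible interval in which `duration` still covers one pixel."""
    return duration * width_px

# Times in microseconds: a 2 us call in a 10,000 us trace on a 1000 px chart.
print(is_resolvable(2, 0, 10_000, 1000))      # whole trace visible
print(is_resolvable(2, 3_000, 4_000, 1000))   # zoomed to a 1,000 us interval
print(longest_interval_for(2, 1000))
```

Under these assumptions, the last value gives an upper bound for the length of the interval to select when looking for calls of a known duration.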

Zooming within a Chart

To zoom into an area, click and hold the left mouse button and select the area. It is possible to zoom horizontally and, in some charts, also vertically. Horizontal zooming in the Master Timeline defines the time interval to be visualized, whereas vertical zooming selects a group of processes to be displayed. To scroll horizontally, move the slider at the bottom or use the mouse wheel. Additionally, the zoom can be changed with the help of the Zoom Toolbar by dragging the borders of the selection rectangle or by scrolling the mouse wheel. To return to the previous zoom state, select the global "Undo" entry in the "Edit" menu or press "Ctrl+Z". Accordingly, a zooming action can be repeated by selecting "Redo" in the "Edit" menu or pressing "Ctrl+Shift+Z". Both functions work independently of the current mouse position. Next to "Undo" and "Redo", the menu shows which kind of action in which display can be undone or redone. To quickly return to the initial zoom state, select Reset Horizontal Zoom or Reset Vertical Zoom in the context menu of the desired timeline display. Resetting the zoom is also an action that can be reverted by "Undo".

The Zoom Toolbar

Vampir provides a Zoom Toolbar that can be used for zooming and navigation in the trace data. It is situated in the upper right corner of the Trace View window, but can be dragged and dropped to another position as desired. The Zoom Toolbar offers an overview of the data displayed in the corresponding charts. The currently zoomed area is highlighted as a rectangle within the Zoom Toolbar. Clicking on one of its two boundaries and moving it (with the left mouse button held) to the intended position executes horizontal zooming in all charts.

Note: Instead of dragging boundaries, it is also possible to use the mouse wheel for zooming. Hover over the Zoom Toolbar and scroll up to zoom in or scroll down to zoom out. Dragging the zoom area changes the section that is displayed without changing the zoom factor. To do so, click in the highlighted zoom area and drag and drop it to the desired region. Double-clicking in the Zoom Toolbar restores the initial zoom state.

Zooming and Navigation within the Zoom Toolbar: (A+B) Zooming in/out with Mouse Wheel; (C) Scrolling by Moving the Highlighted Zoom Area; (D) Zooming by Selecting and Moving a Boundary of the Highlighted Zoom Area

The colors represent user-defined groups of functions or activities. Please note that all charts added to the Trace View window will adapt their statistics information according to this time interval selection. The Zoom Toolbar can be disabled and enabled with the toolbar's context menu entry Zoom Toolbar.

The Charts Toolbar

Use the Charts toolbar to open instances of the different charts. It is situated in the upper left corner of the main window by default, but can be dragged and dropped as desired. The Charts toolbar can be disabled with the toolbar's context menu entry Charts. The table below shows the different icons representing the charts in the Charts toolbar. The icons are arranged in three groups, divided by a small separator. The first group represents timeline charts, whose zooming states affect all other charts. The second group consists of statistical charts, providing special information and statistics for a chosen interval. Vampir allows multiple instances for charts of these categories. The last group comprises informational charts, providing specific textual information or legends. Only one instance of an informational chart can be opened at a time.

Icons of the Toolbar

Icon	Name	Description
	Master Timeline	Master Timeline
	Process Timeline	Process Timeline
	Counter Data Timeline	Counter Data
	Performance Radar	Performance Radar
	Function Summary	Function Summary
	Message Summary	Message Summary
	Process Summary	Process Summary
	Communication Matrix View	Communication Matrix View
	Call Tree	Call Tree
	Function Legend	Function Legend
	Context View	Context View
	Marker View	Marker View

Properties of the Tracefile

Vampir provides a display containing the most important characteristics of the loaded trace file. This table is called Trace Properties and can be accessed via File->Trace Properties. The information, such as the file name, the creator, and its version, originates from the trace file and is not changed by Vampir.

Performance Data Visualization

This chapter deals with the different charts that can be used to analyze the behavior of a program and the comparison between different function groups, e.g., MPI and Calculation. In addition, the chapter addresses communication performance issues. Various charts address the visualization of data transfers between processes. The following sections describe them in detail.

Timeline Charts

A very common chart type used in event-based performance analysis is the so-called timeline chart. This chart type graphically presents the chain of events of monitored processes or counters on a horizontal time axis. Multiple timeline chart instances can be added to the Trace View window via the Chart menu or the Charts toolbar.

Note: To measure the duration between two events in a timeline chart, Vampir provides a tool called the ruler. Click on the first event in a timeline display and move the mouse while keeping the left mouse button and the Shift key pressed. A ruler-like pattern appears in the current timeline chart, which allows a rough measurement directly. The exact times of the start event and of the mouse position, as well as the interval in between, are given at the very bottom. If the Shift key is released before the left mouse button, Vampir proceeds with zooming.

Master Timeline and Process Timeline

In the Master and Process Timelines, detailed information about functions, communication, and synchronization events is shown. Timeline charts are available for individual processes (Process Timeline) as well as for a collection of processes (Master Timeline). The Master Timeline consists of a collection of rows. Each row represents a single process, as shown in the figure below. A Process Timeline shows the different levels of function calls in a stacked bar chart for a single process, as depicted in the second figure.

Master Timeline

Process Timeline

Every timeline row consists of a process name on the left and a colored sequence of function calls or program phases on the right. The color of a function is defined by its group membership; e.g., MPI_Send(), which belongs to the function group MPI, has the same color (typically red) as MPI_Recv(), which also belongs to the function group MPI. Clicking on a function highlights it and causes the Context View display to show detailed information about that particular function, e.g., its function group name, time interval, and complete name. The Context View display is explained in its own section below. Some function invocations are very short and do not show up in the overall view because of a lack of display pixels. A zooming mechanism is provided to inspect a specific time interval in more detail. After zooming, panning in the horizontal direction is possible with the scroll bar at the bottom.

The Process Timeline resembles the Master Timeline with slight differences. The chart's timeline is divided into levels, which represent the different call stack levels of function calls. The initial function begins at the first level, a sub-function called by that function is located one level beneath, and so forth. When a sub-function returns to its caller, the graphical representation also returns to the level above.

In addition to the display of categorized function invocations, Vampir's Master and Process Timelines also provide information about communication events. Messages exchanged between two different processes are depicted as black lines. In timeline charts, time progresses from left to right. The leftmost point of a message line and its underlying process bar therefore identify the sender of the message, whereas the rightmost point of the same line identifies the receiver. The corresponding function calls normally reflect a pair of MPI communication calls like MPI_Send() and MPI_Recv(). It is also possible to show a collective communication like MPI_Allreduce() by selecting one corresponding message, as shown in the figure.

Selected MPI Collective in Master Timeline
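The call stack levels of the Process Timeline described above can be derived from the enter/leave events of a process. The following Python sketch assigns a level to every function call, with level 1 for the outermost function. The event layout and the function name call_levels are assumptions made for this illustration and do not reflect how Vampir stores or draws its data.

```python
def call_levels(events):
    """Assign a call stack level to every function call of one process.

    events: time-ordered (timestamp, kind, function) tuples with kind
            "enter" or "leave" (hypothetical layout, properly nested).
    Returns (function, level, start, end) tuples ordered by start time,
    where level 1 is the outermost call.
    """
    stack = []  # currently open calls as (function, start)
    bars = []
    for timestamp, kind, function in events:
        if kind == "enter":
            stack.append((function, timestamp))
        else:
            name, start = stack.pop()
            # The callers still on the stack determine the level.
            bars.append((name, len(stack) + 1, start, timestamp))
    return sorted(bars, key=lambda bar: bar[2])

# Hypothetical event stream of a single process.
events = [
    (0, "enter", "main"),
    (2, "enter", "MPI_Send"),
    (5, "leave", "MPI_Send"),
    (5, "enter", "compute"),
    (6, "enter", "MPI_Recv"),
    (7, "leave", "MPI_Recv"),
    (9, "leave", "compute"),
    (10, "leave", "main"),
]

for bar in call_levels(events):
    print(bar)
```

Each resulting tuple corresponds to one bar in a stacked chart: the level selects the row, and the start and end times select the horizontal extent.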

Additional information like message bursts, markers, and I/O events is also available. The table shows the symbols and descriptions of these objects.

Additional Information in Master and Process Timeline

Symbol	Description
Message Burst	Because of a lack of pixels it is not possible to display a large number of messages in a very short interval. Therefore, these messages are summarized as so-called message bursts. Zooming into this interval reveals the corresponding single messages.
Markers (multiple, single)	To indicate particular points (like errors or warnings) during the runtime of an application, markers can be used in a tracefile. They are drawn as triangles, which are colored according to their types. To illustrate that two or more markers are placed at the same pixel, a multiple marker is drawn.
I/O Events (multiple, single, single selected)	Vampir shows detailed information about I/O operations, if they are included in the tracefile. I/O events are depicted as triangles at the beginning of an I/O interval. Multiple I/O events are tricolored and occupy a line to the end of the interval. To see the whole interval of a single I/O event, the triangle has to be selected. In that case, a second triangle at the end of the interval appears.

Since the Process Timeline reveals information of one process only, short black arrows are used to indicate outgoing communication. Clicking on message lines or arrows shows message details like sender process, receiver process, message length, message duration, and message tag in the Context View display.

Counter Data Timeline

Counters are values collected over time to count certain events like floating point operations or cache misses. Counter values can be used to store not just hardware performance counters but arbitrary sample values. There can be counters for different statistical information as well, for instance, counting the number of function calls or a value in an iterative approximation of the final result. Counters are defined during the instrumentation of the application and can be individually assigned to processes.

Counter Data Timeline

The chart is restricted to one counter at a time. It shows the selected counter for one process. Using multiple instances of the Counter Data Timeline, counters or processes can be compared easily. The context menu entry Set Counter allows you to choose the displayed counter directly from a drop-down list. The entry Set Process selects the particular process for which the counter is shown.

Performance Radar

The Performance Radar chart supports searching for function occurrences in the trace file and provides an extended visualization of counters. A function may not be shown in the Master and Process Timelines because of its short runtime. As an alternative to zooming, the option Find Function... can be used: a color-coded timeline then indicates the intervals in which the function is executed.

Performance Radar Timeline - Search of Functions

By default, the Performance Radar shows the values of one counter for each process (thread). In this mode the user can choose between Line Plot and Color Coded drawing. In the latter case, a color scale at the bottom provides information about the range of values. Clicking on Set Counter... opens a dialog that offers the option of choosing another counter and of calculating sum or average values. Sum means that the values of the selected counter are added up over all processes. The average is this sum divided by the number of processes. Both options produce a single graph.

Performance Radar Timeline - Visualization of Counters
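The sum and average options described above reduce the per-process counter values to a single series. The following Python sketch shows that reduction for counter values that were sampled at the same points in time on every process, which is a simplifying assumption. The data layout and the function name combine_counter are hypothetical.

```python
def combine_counter(samples_by_process):
    """Sum and average one counter over all processes, sample by sample.

    samples_by_process: {process: [value at sample 0, value at sample 1, ...]}
    with equally long lists for all processes (hypothetical layout).
    Returns (sums, averages) as two lists.
    """
    series = list(samples_by_process.values())
    sums = [sum(values) for values in zip(*series)]
    averages = [total / len(series) for total in sums]
    return sums, averages

# Hypothetical counter samples of two processes.
samples = {
    "Process 0": [1, 2, 3],
    "Process 1": [3, 4, 5],
}
sums, averages = combine_counter(samples)
print(sums)
print(averages)
```

Both results contain one value per sample, matching the single graph that is drawn for either option.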

Statistical Charts

Call Tree

The Call Tree illustrates the invocation hierarchy of all monitored functions in a tree representation. The display reveals information about the number of invocations of a given function, the time spent in the different calls, and the caller-callee relationship.

Call Tree

The entries of the Call Tree can be sorted in various ways. Simply click on one of the tree headers to re-sort the Call Tree by that characteristic. Please note that not all available characteristics are enabled by default. To add or remove characteristics, open the context menu by right-clicking on any of the tree headers. To browse the different function calls, it is possible to fold and unfold the levels of the tree. This can be achieved by double-clicking a level, or by using the fold level buttons next to the function name. Functions can be called by many different caller functions, which is not obvious in the tree representation. Therefore, a relation view shows all callers and callees of the currently selected function in two separate lists, as shown in the lower area. To find a certain function by its name, Vampir provides a search option accessible with the context menu entry Show Find View. The entered keyword has to be confirmed by pressing the Return key. The Previous and Next buttons can then be used to step through the results.

Function Summary

The Function Summary chart gives an overview of the accumulated time consumption across all function groups and functions. For example, every time a process calls the MPI_Send() function, the elapsed time of that function is added to the MPI function group time. The chart gives a condensed view of the execution of the application and allows the different function groups to be compared, so that dominant function groups can be distinguished easily.

Function Summary

It is possible to change the information displayed via the context menu entry Set Metric, which offers values like Average Exclusive Time, Number of Invocations, Accumulated Inclusive Time, and others. Note: Inclusive means the amount of time spent in a function and all of its subroutines. Exclusive means the amount of time spent in just this function.

The context menu entry Set Event Category specifies whether function groups or functions are displayed in the chart. Functions are shown in the color of their function group. It is possible to hide functions and function groups from the displayed information with the context menu entry Filter. To mark the function or function group to be filtered, click the associated label or color representation in the chart.

Using the Process Filter allows you to restrict this view to a set of processes. As a result, only the time consumed by these processes is displayed for each function group or function. Instead of using the filter (which affects all other displays by hiding processes), it is possible to select a single process via Set Process in the context menu of the Function Summary. This does not have any effect on other displays.

The Function Summary can be shown as a Histogram (a bar chart, as in timeline charts) or as a Pie Chart. To switch between these representations, use the Set Chart Mode entry of the context menu. The shown functions or function groups can be sorted by name or value via the context menu option Sort By.

Process Summary

The Process Summary is similar to the Function Summary but shows the information for every process independently.

Process Summary

This is useful for analyzing the balance between processes to reveal bottlenecks. For instance, finding that one process spends significantly more time on calculations than the others could indicate an unbalanced distribution of work that can slow down the entire application. The context menu entry Set Event Category specifies whether function groups or functions are displayed in the chart. Functions are shown in the color of their function group. The chart can base its analysis on Exclusive Time or Inclusive Time. To change between these two modes, use the context menu entry Set Metric. It is possible to hide functions and function groups from the displayed information with the context menu entry Filter. To mark the function or function group to be filtered, click on the associated color representation in the chart. Using the Process Filter allows you to restrict this view to a set of processes.

Message Summary

The Message Summary is a statistical chart showing an overview of the different messages grouped by certain characteristics.

Message Summary Chart with metric set to Message Transfer Rate showing the average transfer rate (A), and the minimal/maximal transfer rate (B)

All values are shown as a bar chart. The number next to each bar is the group base, while the number inside a bar is the value of the chosen metric. The Set Metric sub-menu of the context menu switches between Aggregated Message Volume, Message Size, Number of Messages, and Message Transfer Rate. The group base can be changed via the context menu entry Group By. It is possible to choose between Message Size, Message Tag, and Communicator (MPI).

Note: There is one bar for every group that occurs. However, if the metric is set to Message Transfer Rate, the minimal and maximal transfer rates are given in an additional bar beneath the one showing the average transfer rate. The additional bar starts at the minimal rate and ends at the maximal one. To filter out messages, click the associated label or color representation in the chart and then choose Filter from the context menu.
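The grouping described above can be sketched as follows. The message records and their field names are hypothetical, and the average rate is taken here as the mean of the per-message rates, which is an assumption about how such a statistic could be defined, not a statement about Vampir's internals.

```python
from collections import defaultdict

def message_summary(messages, group_by):
    """Group messages by a group base and compute per-group metrics.

    messages: list of dicts with keys "size" (bytes), "tag", and
    "duration" (seconds); group_by is "size" or "tag".
    """
    groups = defaultdict(list)
    for msg in messages:
        groups[msg[group_by]].append(msg)

    summary = {}
    for base, msgs in groups.items():
        rates = [m["size"] / m["duration"] for m in msgs]  # bytes per second
        summary[base] = {
            "number_of_messages": len(msgs),
            "aggregated_volume": sum(m["size"] for m in msgs),
            "avg_rate": sum(rates) / len(rates),
            "min_rate": min(rates),
            "max_rate": max(rates),
        }
    return summary

# Made-up messages; durations are chosen so the arithmetic is exact.
messages = [
    {"size": 1024, "tag": 1, "duration": 0.5},
    {"size": 1024, "tag": 1, "duration": 0.25},
    {"size": 4096, "tag": 2, "duration": 1.0},
]
summary = message_summary(messages, "tag")
for tag, metrics in sorted(summary.items()):
    print(tag, metrics)
```

For tag 1 the sketch yields two messages, an aggregated volume of 2048 bytes, and a rate range from 2048 to 4096 bytes per second, which corresponds to the additional minimum/maximum bar described in the note.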

Communication Matrix View

The Communication Matrix View is another way of analyzing communication imbalances. It shows information about messages sent between processes.

Communication Matrix View

The chart is realized as a table. Its rows represent the sending processes, while its columns represent the receivers. The color legend on the right indicates the displayed values and changes with the type of information shown. Different metrics, such as the average duration of messages passed from sender to recipient or the minimum and maximum bandwidth, are offered. To change the type of value that is displayed, use the context menu option Set Metric. Use the Process Filter to define which processes/groups should be displayed.

Note: A long duration is not necessarily caused by a slow communication path between two processes. The duration is the time between the start of transmission and the successful reception of the message, so a recipient that delays reception for some reason increases it as well. Such a delay increases the duration (by the amount of the delay) and decreases the message rate, which is the size of the message divided by the duration, accordingly.
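A minimal sketch of such a sender-by-receiver table, and of the effect a late receiver has on duration and message rate, is given below. The message tuples, process numbers, and times are made up for illustration and are not taken from a real trace.

```python
def communication_matrix(messages, num_processes):
    """Build a sender x receiver table of average message duration.

    messages: list of (sender, receiver, size_bytes, send_time, receive_time).
    Rows are senders, columns are receivers; None marks pairs without messages.
    """
    totals = [[0.0] * num_processes for _ in range(num_processes)]
    counts = [[0] * num_processes for _ in range(num_processes)]
    for sender, receiver, _size, send_time, receive_time in messages:
        totals[sender][receiver] += receive_time - send_time
        counts[sender][receiver] += 1
    return [
        [totals[s][r] / counts[s][r] if counts[s][r] else None
         for r in range(num_processes)]
        for s in range(num_processes)
    ]

def message_rate(size_bytes, send_time, receive_time):
    """Message rate as defined in the note: size divided by duration."""
    return size_bytes / (receive_time - send_time)

# Two equally sized messages from process 0. Process 2 posts its receive
# 1.5 s late, so the measured duration grows from 0.5 s to 2.0 s.
messages = [
    (0, 1, 4096, 0.0, 0.5),
    (0, 2, 4096, 0.0, 2.0),
]
matrix = communication_matrix(messages, 3)
print(matrix[0])                          # prints: [None, 0.5, 2.0]
print(message_rate(4096, 0.0, 0.5))       # prints: 8192.0
print(message_rate(4096, 0.0, 2.0))       # prints: 2048.0
```

Both messages could have travelled over equally fast links; the lower rate for the second one comes entirely from the delayed reception.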

Informational Charts

Function Legend

The Function Legend lists all visible function groups of the loaded trace file along with their corresponding colors.

Function Legend

Functions whose colors have been changed also appear in the legend, arranged in a tree-like fashion under their respective function group.

A chosen marker (A) and its representation in the Marker View (B)

The Marker View is organized as a tree that arranges the marker events by their respective groups and types. Additional information, such as the time of occurrence in the trace file and a description, is provided for each marker. Clicking a marker event in the Marker View selects this event in the timeline displays that are currently open, and vice versa. If the marker event is not visible, the zooming area jumps to it automatically. It is also possible to select markers and types. In that case, all events belonging to that marker or type are selected in the Master Timeline and the Process Timeline. Holding Ctrl or Shift allows several events to be highlighted. The borders of the zooming area in the timeline charts can then be fitted to the timestamps of the two marker events that were selected last.

Context View

Context View, showing context information (B) of a selected function (A)

As implied by its name, the Context View provides more detailed information about a selected object than its graphical representation does. An object, e.g., a function, function group, message, or message burst, can be selected directly in a chart by clicking its graphical representation. The Context View provides different context information for different types of objects. For example, the object-specific information for functions holds properties like Interval Begin, Interval End, and Duration. The Context View may contain several tabs, and a new empty one can be added by clicking the add-symbol on the right-hand side. If an object in another chart is selected, its information is displayed in the current tab. If the Context View is closed, it opens automatically at that moment. The Context View can also compare the information displayed in different tabs. Use the = on the left-hand side and choose two objects in the dialog that appears. It is possible to compare different elements from different charts, which can be useful in some cases. The comparison shows a list of common properties. The corresponding values are displayed, along with their difference if the values are numbers. The first line always shows the names of the displays.

Comparison between Context Information
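The comparison logic described above (common properties only, with a difference for numeric values) can be sketched as follows. The property names Interval Begin, Interval End, and Duration come from the text; the values, the extra property, and the sign convention of the difference (second minus first) are assumptions made for illustration.

```python
from numbers import Number

def compare_context(first, second):
    """List properties common to both objects, with a difference for numbers."""
    rows = []
    for name in first:
        if name not in second:
            continue  # only properties present in both objects are compared
        a, b = first[name], second[name]
        numeric = isinstance(a, Number) and isinstance(b, Number)
        rows.append((name, a, b, b - a if numeric else None))
    return rows

# Hypothetical context information of two selected functions.
function_a = {"Name": "MPI_Send", "Interval Begin": 1.0,
              "Interval End": 1.5, "Duration": 0.5}
function_b = {"Name": "MPI_Recv", "Interval Begin": 2.0,
              "Interval End": 3.0, "Duration": 1.0, "Process": "Process 1"}

rows = compare_context(function_a, function_b)
for row in rows:
    print(row)
```

The property Process exists only for the second object and is therefore left out, while the three numeric properties each receive a difference.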

Information Filtering and Reduction

Due to the large amount of information that can be stored in trace files, it is usually necessary to reduce the displayed information according to some filter criteria. Vampir offers different ways of filtering. It is possible to limit the displayed information to a selection of processes or to specific types of communication events, e.g., to certain types of messages or collective operations. Deselecting an item in a filter means that this item is fully masked. In Vampir, filters are global. Therefore, masked items will no longer show up in any chart. Filtering affects not only the different charts but also the Zoom Toolbar. The different filters can be reached via the Filter entry in the main menu.

The example below shows a typical process representation in the Process Filter window. The same kind of representation is used in all other filters. Processes can be filtered by their Process Group, Communicators, and Process Hierarchy. Items to be filtered are arranged in a spreadsheet representation. In addition to selecting or deselecting an entire group of processes, it is also possible to filter single processes.

Process Filter

Different selection methods can be used in a filter. The check box Include/Exclude All either selects or deselects every item. Specific items can be selected or deselected by clicking the check box next to them. Furthermore, it is possible to select or deselect multiple items at once: mark the desired entries by clicking their names while holding either the Shift or the Ctrl key. Holding the Shift key marks every item between the two clicked items. Holding the Ctrl key, on the other hand, adds specific items to or removes them from the marked ones. Clicking the check box of one of the marked entries selects or deselects all of them.

Options of Filtering

Filter Object	Filter Criteria
Processes	Process Groups
	Communicators
	Process Hierarchy
	Single Processes
Collective Operations	Communicators
	Collective Operations
Messages	Message Communicators
	Message Tags
I/O Events	I/O Groups
	Files
	Types

Customization

The appearance of the trace file and various other application settings can be altered in the preferences accessible via the main menu entry File->Preferences. Settings concerning the trace file itself, e.g., layout or function group colors, are saved individually next to the trace file in a file with the extension .vsettings. In this way, it is possible to adjust the colors for one trace file without interfering with other trace files. The options Import Preferences and Export Preferences allow the preferences of arbitrary trace files to be loaded and saved.

General Preferences

The General settings allow you to change application and trace specific values.

General Settings

Show time as determines whether the time format for the trace analysis is based on seconds or ticks. The next option, Use color gradient in charts, allows you to switch off the color gradient used in the performance charts. The option after that changes the style and size of the font. Show source code allows you to open an editor showing the respective source file. To open a source file, first click the intended function in the Master Timeline and then the source code path in the Context View. For the source code location to work properly, you need a trace file with source code location support. The path of the source file can be adjusted in Preferences. A limit for the size of the file can be set, too. Finally, you can choose whether Vampir automatically checks for new versions.

Appearance

In the Appearance settings of the Preferences dialog, there are six different objects for which the color options can be changed: the functions/function groups, markers, counters, collectives, messages and I/O events. Choose an entry and click on its color to make a modification. A color picker dialog opens where it is possible to adjust the color. For messages and collectives, a change of the line width is also available.

Appearance Settings

To quickly find the desired item, a search box is provided at the bottom of the dialog.

Saving Policy

Vampir detects whenever changes to the various settings are made. In the Saving Policy dialog it is possible to adjust the saving behavior of the different components to your own needs.

Saving Policy Settings

In the dialog Saving Behavior you tell Vampir what to do when preferences have changed. You can choose the categories of settings (e.g., layout) to which this applies. Possible options are that the application Always or Never saves changes automatically. The default option is to have Vampir ask you whether to save or discard changes. Usually the settings are stored in the folder of the trace file. If you have no write access to that folder, the settings can be placed in the Application Data Folder instead. Settings stored there are listed in the tab Locally Stored Preferences with their creation and modification dates.

Note: On loading, Vampir favors settings in the Application Data Folder. The Default Preferences option saves the preferences of the current trace file as default settings, which are then used for trace files that have no settings of their own. Another option restores the default settings; in this case, the current preferences of the trace file are reverted.
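The loading order implied by this note can be summarized in a small sketch. The function and the returned labels are hypothetical, and the final fallback to built-in settings is an assumption; the sketch only restates the precedence described in the text and does not call into Vampir.

```python
def pick_settings_source(in_app_data, next_to_trace, has_defaults):
    """Return which settings source would be used under the rules above.

    in_app_data:   settings for this trace exist in the Application Data Folder
    next_to_trace: a .vsettings file exists in the folder of the trace file
    has_defaults:  default preferences have been saved by the user
    """
    if in_app_data:
        return "application data folder"  # favored on loading
    if next_to_trace:
        return "trace file folder"
    if has_defaults:
        return "default preferences"      # used for traces without settings
    return "built-in settings"            # assumed fallback

print(pick_settings_source(True, True, True))    # prints: application data folder
print(pick_settings_source(False, False, True))  # prints: default preferences
```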

Footnotes

Additional links that might be of interest to the reader:

... (OTF) http://www.tu-dresden.de/zih/otf

... WindowsHPC http://resourcekit.windowshpc.net/MORE_INFO/TracingMPIApplications.html

... Manual http://www.tu-dresden.de/zih/vampirtrace
