Source: http://www.totalviewtech.com/pdf/MPIStandardsTechArticleFinal.pdf

MPI Debugging Standards… What Took So Long?!

John DelSignore, Jr., CTO
TotalView Technologies
February, 2009

It has been said, "The nice thing about standards is that there are so many of them to choose from." But when it comes to standard debugging interfaces for parallel computers, there aren't enough to choose from! The lack of such standards makes it difficult for tool vendors like TotalView Technologies to support the constantly evolving software environments used on parallel supercomputers, networks of workstations, and clusters.

TotalView Technologies' involvement in High-Performance Computing (HPC) dates back to the mid-1980s, when the TotalView debugger was first created for the BBN Butterfly parallel supercomputer. Back in those days, parallel programming techniques for HPC were still evolving, but by the early 1990s an effort to produce a standard Message Passing Interface (MPI) was under way, and the first MPI standard was completed in May 1994.

The TotalView debugger, already a mature parallel debugger for the BBN Uniform System and Oak Ridge National Laboratory’s Parallel Virtual Machine, was quickly adapted to support debugging MPI applications. In early 1995, TotalView's Jim Cownie and Argonne National Laboratory’s Bill Gropp and Rusty Lusk decided to join forces and develop debugging interfaces for use with MPICH, one of the first widely available MPI implementations. Two interfaces were developed: one for process discovery and one for message queue extraction. Dubbed the "MPIR" interfaces, they eventually became de facto standards, implemented by MPI providers such as Compaq, HP, IBM, LAM, MPI Software Technologies, Open MPI, Quadrics, SCALI, SGI, and others.
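To make the process discovery idea concrete, here is a minimal sketch in C of how an MPI starter process might publish its process table for a debugger to read. The symbol names (`MPIR_PROCDESC`, `MPIR_proctable`, `MPIR_proctable_size`, `MPIR_debug_state`, `MPIR_being_debugged`, `MPIR_Breakpoint`) follow the MPIR convention as it is commonly described. Because no formal specification existed at the time, treat the exact layout as an assumption rather than a normative definition. The helpers `publish_processes` and `copy_string` are hypothetical names introduced only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* One entry per MPI process (layout assumed from the MPIR convention). */
typedef struct {
    char *host_name;        /* host on which the process runs */
    char *executable_name;  /* path of the executable image */
    int   pid;              /* process id on that host */
} MPIR_PROCDESC;

#define MPIR_NULL          0
#define MPIR_DEBUG_SPAWNED 1

/* Global symbols that a debugger looks up in the starter process. */
MPIR_PROCDESC *MPIR_proctable = NULL;
int MPIR_proctable_size = 0;
volatile int MPIR_being_debugged = 0;   /* set by the debugger, not by us */
volatile int MPIR_debug_state = MPIR_NULL;

/* Intentionally empty: a debugger plants a breakpoint on this function
   and reads the table when the starter reaches it. */
void MPIR_Breakpoint(void) {}

/* Hypothetical helper: heap-allocated copy of a string, or NULL. */
static char *copy_string(const char *s) {
    char *p = malloc(strlen(s) + 1);
    if (p != NULL) strcpy(p, s);
    return p;
}

/* Hypothetical helper: fill the table for n processes that all run the
   same executable on the same host, then notify the debugger.
   Returns 0 on success and -1 on bad input or allocation failure. */
int publish_processes(const char *host, const char *exe,
                      const int *pids, int n) {
    if (n <= 0) return -1;
    MPIR_PROCDESC *table = calloc((size_t)n, sizeof *table);
    if (table == NULL) return -1;

    for (int i = 0; i < n; i++) {
        table[i].host_name = copy_string(host);
        table[i].executable_name = copy_string(exe);
        table[i].pid = pids[i];
        if (table[i].host_name == NULL || table[i].executable_name == NULL) {
            /* Roll back everything allocated so far. */
            for (int j = 0; j <= i; j++) {
                free(table[j].host_name);
                free(table[j].executable_name);
            }
            free(table);
            return -1;
        }
    }

    MPIR_proctable = table;
    MPIR_proctable_size = n;
    MPIR_debug_state = MPIR_DEBUG_SPAWNED;  /* processes exist; table is valid */
    MPIR_Breakpoint();                      /* debugger stops here and reads it */
    return 0;
}
```

In this sketch, a starter would call `publish_processes` once after launching its processes. A debugger attached to the starter could then walk `MPIR_proctable` and attach to each listed `pid` on its `host_name`. Message queue extraction works differently (through a library loaded into the debugger) and is not shown here.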

Even though the MPIR debugging interfaces are still widely used today by a number of MPI implementations and tool vendors, MPIR has not yet been standardized. Bill Gropp writes, "...there never was a formal 'MPIR' spec for the process discovery - there was a hack that was created for the first prototype implementation with MPICH and the ch_p4 device, but this was never intended to be a standard. Unfortunately, like so many prototypes, since there wasn't a standard, this part of the implementation was reverse-engineered into other implementations."

Fortunately, this is about to change. As part of the MPI 3.0 Standardization Effort, the MPI 3.0 Tools Support Working Group was formed. Led by Martin Schulz and Bronis de Supinski of Lawrence Livermore National Laboratory, the working group is chartered to provide reliable and portable interfaces for MPI tools with new functionality currently not covered by the MPIR interfaces.

The working group will focus on several areas, including tool deployment interfaces and introspection interfaces. The tool deployment interfaces allow tools to perform process discovery, process acquisition, tool daemon deployment, and topology discovery, and to establish inter-tool communication channels. The introspection interfaces allow tools to "look inside" the MPI implementation and extract valuable debugging and performance information.

The MPI Forum Working Group on Tools is currently operating from the following outline:
Area 1: Tool Deployment Interfaces
Area 2: Introspection Interfaces
Additional ideas - Placement

Twelve years overdue, the creation of formal MPI debugging standards will help foster tools that are capable of keeping pace with the rapidly growing scale and complexity of today’s HPC systems. TotalView Technologies is proud to be one of the original innovators of the highly successful MPIR interfaces, and is excited about the prospect of contributing to the efforts of the MPI 3.0 Tools Support Working Group.

For more information on the working group, see: http://meetings.mpi-forum.org/mpi3.0_tools.php.
TotalView Technologies provides support for many implementations of MPI, OpenMP, and UPC. See: http://totalviewtech.com/support/tv-v-linux-x86.html.