Open MPI recently revamped its entire run-time parameter system (a.k.a. the "MCA parameter system") as part of implementing the "MPI_T" interface from MPI-3.
The MPI_T interface is a standardized interface designed for MPI tools, but can be used by regular MPI application programs, too.
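Specifically, MPI_T provides programmatic access to two types of MPI implementation data: control variables, which tune the implementation's behavior, and performance variables, which expose its internal performance metrics.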
Open MPI has long had an "MCA parameter" system, which allows users to tweak Open MPI's behavior through mpirun CLI options, environment variables, and text files.
Nathan Hjelm, from Los Alamos National Laboratory, recently did the bulk of the work to both revamp the internals of Open MPI's MCA parameter system and tie it directly to the MPI_T control variable interface. As a result of this work, MPI_T control variables are MCA parameters (and vice versa).
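To make the three ways of setting a parameter concrete, here is a minimal Python sketch of how one value might be resolved from an mpirun-style command line, the environment, and a parameter file. It only models the lookup and does not call Open MPI. The OMPI_MCA_ environment prefix and the "--mca name value" option are Open MPI conventions; the precedence order (command line, then environment, then file, then default), the source labels other than "default value", and the function names are assumptions made for illustration.

```python
ENV_PREFIX = "OMPI_MCA_"  # Open MPI reads OMPI_MCA_<name> from the environment


def parse_cli(argv):
    """Collect '--mca <name> <value>' triples from an mpirun-style argv."""
    params = {}
    i = 0
    while i < len(argv):
        if argv[i] in ("--mca", "-mca") and i + 2 < len(argv):
            params[argv[i + 1]] = argv[i + 2]
            i += 3
        else:
            i += 1
    return params


def parse_param_file(text):
    """Parse 'name = value' lines; anything after '#' is treated as a comment."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def resolve(name, argv, env, file_text, default=None):
    """Return (value, source) using the assumed precedence order."""
    cli = parse_cli(argv)
    if name in cli:
        return cli[name], "command line"
    if ENV_PREFIX + name in env:
        return env[ENV_PREFIX + name], "environment"
    from_file = parse_param_file(file_text)
    if name in from_file:
        return from_file[name], "file"
    return default, "default value"


if __name__ == "__main__":
    argv = ["mpirun", "--mca", "btl_tcp_if_include", "eth0", "-np", "4", "./a.out"]
    env = {"OMPI_MCA_btl_tcp_if_exclude": "lo"}
    print(resolve("btl_tcp_if_include", argv, env, ""))
    print(resolve("btl_tcp_if_exclude", argv, env, ""))
```

The returned source string plays the same role as the "data source" field that ompi_info prints next to each parameter.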
(Note that all of the MPI_T control variable work described in this blog entry will be included in the upcoming Open MPI 1.7.3 release.)
One of the features we're very excited about is the ability to assign an MPI_T-defined "level" to each control variable.
Specifically, Open MPI has a bazillion control variables (a.k.a., MCA parameters). This is both a curse and a blessing: it's a blessing because power users can tweak just about any aspect of Open MPI's behavior. It's a curse because the sheer number of knobs to turn is bewildering to new users.
With Nathan's new implementation, we can assign an MPI_T "level" to each control variable indicating for whom the variable is targeted. There are three main levels: end users, application tuners, and MPI developers.
Each of these three main levels has three sub-levels (basic, detail, and all), allowing a finer gradation. For example, Open MPI intentionally puts very, very few parameters in the End user/Basic category so that new users will a) likely see only the control variables that they need for correctness, and b) not be frustrated by needing to sort through a bazillion total control variables to find the ones they want.
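As a rough illustration of how the three main levels and three sub-levels combine, the Python sketch below maps a numeric level from 1 to 9 to an audience/sub-level label. The numbering follows the ompi_info output in this post (1 is user/basic, 6 is App tuner/All). The middle sub-level is named "detail" here, following the MPI_T_VERBOSITY_*_DETAIL constants in MPI-3; the short labels "tuner" and "dev" and the function name are assumptions for illustration.

```python
AUDIENCES = ("user", "tuner", "dev")     # end user, application tuner, MPI developer
SUBLEVELS = ("basic", "detail", "all")


def describe_level(level):
    """Map a numeric level (1-9) to an 'audience/sub-level' label."""
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    audience, sub = divmod(level - 1, 3)
    return f"{AUDIENCES[audience]}/{SUBLEVELS[sub]}"


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, describe_level(n))
```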
This past week, we started limiting the output of the ompi_info command: it now shows only End user/Basic parameters by default. For example, by default, you now see only a few control variables for any given network transport, and only two for TCP:
$ ompi_info --param btl tcp
MCA btl: parameter "btl_tcp_if_include" (current value: "", data source: default value, level: 1 user/basic)
    Comma-delimited list of devices and/or CIDR notation of networks to use for MPI communication (e.g., "eth0,192.168.0.0/16"). Mutually exclusive with btl_tcp_if_exclude.
MCA btl: parameter "btl_tcp_if_exclude" (current value: "127.0.0.1/8,sppp", data source: default value, level: 1 user/basic)
    Comma-delimited list of devices and/or CIDR notation of networks to NOT use for MPI communication -- all devices not matching these specifications will be used (e.g., "eth0,192.168.0.0/16"). If set to a non-default value, it is mutually exclusive with btl_tcp_if_include.
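The "comma-delimited list of devices and/or CIDR notation" format described in that output can be illustrated with a short Python sketch. This is not Open MPI's matching code; it is a minimal model, assuming an entry containing "/" is a network in CIDR notation and any other entry is a device name. The function name and the interface names and addresses in the usage lines are hypothetical.

```python
import ipaddress


def matches(spec_list, if_name, if_addr):
    """Return True if the interface matches any entry of a comma-delimited list."""
    addr = ipaddress.ip_address(if_addr)
    for spec in spec_list.split(","):
        spec = spec.strip()
        if not spec:
            continue
        if "/" in spec:
            # strict=False accepts entries such as 127.0.0.1/8,
            # where the host bits of the address are set.
            if addr in ipaddress.ip_network(spec, strict=False):
                return True
        elif spec == if_name:
            return True
    return False


if __name__ == "__main__":
    exclude = "127.0.0.1/8,sppp"  # default btl_tcp_if_exclude shown above
    print(matches(exclude, "lo", "127.0.0.1"))      # loopback is excluded
    print(matches(exclude, "eth0", "192.168.1.5"))  # eth0 is not excluded
```

With the default exclude list, the loopback address falls inside 127.0.0.1/8, so it would be skipped, while an address on 192.168.1.0/24 would remain usable.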
If you want to see many more control variables, use the --level option. In this example, we ask for App tuner/All:
$ ompi_info --param btl tcp --level 6
MCA btl: parameter "btl_tcp_if_include" (current value: "", data source:
[...and 20 more TCP-related control variables...]
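The effect of the level option can be sketched as a simple filter. The Python below assumes that asking for level N shows every variable whose level is at or below N, which is consistent with btl_tcp_if_include (level 1) still appearing at level 6. The two btl_tcp_* names and their level come from the output above; the other two entries and their levels are hypothetical placeholders.

```python
def visible(variables, max_level=1):
    """Return the names of variables whose level is at or below max_level."""
    return [name for name, level in variables if level <= max_level]


# (name, level) pairs; the last two entries are made up for illustration.
VARIABLES = [
    ("btl_tcp_if_include", 1),
    ("btl_tcp_if_exclude", 1),
    ("example_tuning_knob", 5),
    ("example_developer_knob", 8),
]

if __name__ == "__main__":
    print(visible(VARIABLES))               # default view: level 1 only
    print(visible(VARIABLES, max_level=6))  # App tuner/All
```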
Open MPI also contains support for an older performance metric introspection system called PERUSE. Unfortunately, no other MPI implementation adopted PERUSE, so the PERUSE effort died.
That being said, as part of our MPI_T work, Nathan is working on updating / converting / revamping our PERUSE-based performance metrics to the new MPI_T performance variables.
This work is ongoing, but it looks very promising.