
MPI_REQUEST_FREE is Evil

Jan 15, 2013

It was pointed out to me that in my last blog post (Don't leak MPI_Requests), I failed to mention the MPI_REQUEST_FREE function.

True enough - I did fail to mention it.  But I did so on purpose, because MPI_REQUEST_FREE is evil.

Let me explain...

MPI_REQUEST_FREE is described in MPI-3 section 3.7.3 as:

Mark the request object for deallocation and set request to MPI_REQUEST_NULL. An ongoing communication that is associated with the request will be allowed to complete. The request will be deallocated only after its completion.

Sounds like a pretty good way to fire a non-blocking send and forget about it, right?  You could do something like this:

MPI_Isend(buffer, ..., &request);   /* start the non-blocking send      */
MPI_Request_free(&request);         /* ...and immediately forget it     */

Looks great!

...except that it isn't.  


The issue is knowing when the send has completed: when is it safe to modify or free the send buffer?  The usual argument is that you can tell when the send has finished by having the receiver send an ACK back to the sender once the message has been received.
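
For illustration, here is a minimal sketch of that "fire-and-forget plus ACK" pattern.  The tag values, peer rank, and buffer size are invented for the example:

#include <mpi.h>
#include <string.h>

#define DATA_TAG 17   /* invented tag values for this sketch */
#define ACK_TAG  18

static void fire_and_forget_send(int dest)
{
    char buffer[1024];
    MPI_Request request;
    int ack;

    memset(buffer, 0, sizeof(buffer));

    /* Post the send, then immediately free the request. */
    MPI_Isend(buffer, (int) sizeof(buffer), MPI_CHAR, dest, DATA_TAG,
              MPI_COMM_WORLD, &request);
    MPI_Request_free(&request);

    /* Wait for the receiver's application-level ACK. */
    MPI_Recv(&ack, 1, MPI_INT, dest, ACK_TAG, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    /* The ACK has arrived, so the application assumes that "buffer"
       is now safe to modify or free... */
}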

Believe it or not, that is not sufficient!

Just because the message has been received at the far side does not mean that the sending side has completed all of its internal accounting and stopped using the send buffer.

To be totally clear: MPI may still be using the send buffer, even though the message has been received at the destination.

That may seem totally counter-intuitive, but it's true.

In the above example, consider what happens if the MPI implementation needs to register the buffer before sending it.  Remember that registering memory takes time; it's "slow".  De-registering memory is even slower.  So even after the local network hardware has finished sending the buffer, the MPI implementation may choose to de-register the memory - which not only takes time, but also likely involves updating critical state in internal MPI implementation tables.

And since the request was freed, the MPI implementation may conclude that completing all work associated with this request is (very) low priority.  Since the app doesn't care about this request any more, why not complete all other (non-freed) requests first?

Additionally, this de-registration / updating work may be occurring asynchronously in the user's application - perhaps even outside of the main application thread.

Hence, even if the main application thread receives an ACK from the receiver, work such as de-registration may not yet be complete.  Freeing the memory before that de-registration / updating work is finished could be catastrophic.

The fact of the matter is: if you free an ongoing send request, you have only one guarantee as to when MPI will be finished with that buffer: when MPI_FINALIZE completes.

A similar situation occurs with freeing non-blocking receive requests.  How will you know for sure that a) the message has been fully received, and b) the MPI implementation is finished with the receive buffer?  Remember that only matching of MPI requests is ordered - you can't know if a message has been fully received, even if a subsequent message of the same signature has been matched.
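
To make that concrete, here is a sketch of the receive-side trap.  The tag, peer rank, and counts are invented for the example:

#include <mpi.h>

static void flawed_recv(int src)
{
    int first[4], second[4];
    MPI_Request r1, r2;

    MPI_Irecv(first, 4, MPI_INT, src, 99, MPI_COMM_WORLD, &r1);
    MPI_Request_free(&r1);          /* we never learn when r1 finishes */

    MPI_Irecv(second, 4, MPI_INT, src, 99, MPI_COMM_WORLD, &r2);
    MPI_Wait(&r2, MPI_STATUS_IGNORE);

    /* Only *matching* is ordered: r2 completing does not mean the
       first message has been fully delivered into "first", nor that
       MPI is finished with that buffer.  Reading "first" here is
       unsafe. */
}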

The moral of the story: only call MPI_REQUEST_FREE on non-blocking requests if you don't care about doing anything with the buffer until after MPI_FINALIZE returns.
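
Put differently, the conventional safe pattern is simply to keep the request and complete it yourself.  A minimal sketch (peer rank, tag, and buffer size are again invented):

#include <mpi.h>

static void safe_send(int dest)
{
    char buffer[1024] = { 0 };
    MPI_Request request;

    MPI_Isend(buffer, (int) sizeof(buffer), MPI_CHAR, dest, 17,
              MPI_COMM_WORLD, &request);

    /* ...overlap other work here... */

    /* MPI_Wait (or a successful MPI_Test) is the only portable way,
       short of MPI_Finalize, to learn that MPI is done with "buffer". */
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    /* Now it is safe to modify, reuse, or free the buffer. */
}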


Tags: HPC, MPI
