
MPI_Ireduce_scatter

Definition

MPI_Ireduce_scatter is the means by which MPI processes can apply a reduction followed by a variable scatter; issuing it as a single call may trigger optimisations that manually combining an MPI_Reduce with an MPI_Scatterv would not. Unlike its blocking counterpart MPI_Reduce_scatter, MPI_Ireduce_scatter returns before the operation is guaranteed to be complete. In other words, when MPI_Ireduce_scatter returns, the buffers passed may not have been exchanged yet, so it is unsafe to reuse them; the user must first check for completion with MPI_Wait or MPI_Test. Also, unlike MPI_Ireduce_scatter_block, MPI_Ireduce_scatter allows blocks to be of different lengths. MPI_Ireduce_scatter is a collective operation; it must be called by every MPI process in the communicator given. Predefined operations are: MPI_MIN, MPI_MAX, MPI_BOR, MPI_BXOR, MPI_LOR, MPI_LXOR, MPI_BAND, MPI_LAND, MPI_SUM and MPI_PROD. Other variants of MPI_Ireduce_scatter are MPI_Reduce_scatter, MPI_Reduce_scatter_block and MPI_Ireduce_scatter_block. Refer to MPI_Reduce_scatter to see the blocking counterpart of MPI_Ireduce_scatter.
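
As a minimal sketch of the completion check mentioned above, the snippet below polls with MPI_Test instead of blocking in MPI_Wait, overlapping the communication with computation. Variable names such as send_values, results and counts are illustrative; see the full example further down for a complete setup.

// Minimal sketch: issue the non-blocking reduce scatter, then poll for
// completion with MPI_Test while doing other work.
MPI_Request request;
MPI_Ireduce_scatter(send_values, results, counts, MPI_INT, MPI_SUM,
                    MPI_COMM_WORLD, &request);

int completed = 0;
while(!completed)
{
    // Placeholder for useful work overlapped with the communication
    MPI_Test(&request, &completed, MPI_STATUS_IGNORE);
}
// From this point on, results can be read and send_values reused safely.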


int MPI_Ireduce_scatter(const void* send_buffer,
                        void* receive_buffer,
                        const int* counts,
                        MPI_Datatype datatype,
                        MPI_Op operation,
                        MPI_Comm communicator,
                        MPI_Request* request);

Parameters

send_buffer
A pointer to the buffer containing the values to send for reduction.
receive_buffer
A pointer to the buffer in which to store the reduced block scattered to this process.
counts
The array containing the number of elements in each block of the reduced result; block i, made of counts[i] elements, is scattered to MPI process i. This array must be identical on every process (see the sketch after this list).
datatype
The type of a buffer element.
operation
The operation to apply to combine the values in the reduction. This operation must be associative; predefined operations are also commutative, whereas user-defined operations may be non-commutative.
communicator
The communicator in which the reduction takes place.
request
The variable in which to store the handle on the non-blocking operation.
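
To make the relationship between counts and the buffer sizes concrete, below is a minimal sketch under the same setup as the example further down (3 MPI processes; variable names are illustrative): every process sends the full reduction buffer, but MPI process i only receives counts[i] elements.

// Minimal sketch, assuming 3 MPI processes; all variable names are
// illustrative.
int counts[3] = {1, 2, 1};

// Every MPI process sends 1 + 2 + 1 = 4 elements...
int send_values[4];

// ...but MPI process i only receives counts[i] reduced elements;
// on MPI process 1 for instance:
int results[2];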

Returned value

The error code returned from the reduce-scatter operation.

MPI_SUCCESS
The routine successfully completed.
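
Note that, by default, errors encountered in MPI routines abort the application rather than being reported through the return code. Below is a minimal sketch of how one could actually inspect the returned value, assuming the error handler on the communicator has first been switched to MPI_ERRORS_RETURN (variable names are illustrative).

// Minimal sketch: ask MPI to report errors through return codes instead
// of aborting the application (the default behaviour).
MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

int error = MPI_Ireduce_scatter(send_values, results, counts, MPI_INT,
                                MPI_SUM, MPI_COMM_WORLD, &request);
if(error != MPI_SUCCESS)
{
    printf("MPI_Ireduce_scatter failed with error code %d.\n", error);
}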

Example


#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/**
 * @brief Illustrates how to use a non-blocking reduce scatter.
 * @details This application is meant to be run with 3 MPI processes. It
 * consists of a sum reduction; every MPI process has four values to send for
 * reduction. The first values from the MPI processes will be reduced and
 * stored on MPI process 0. The second and third values will be reduced
 * separately and stored on MPI process 1, and similarly the fourth values on
 * MPI process 2. It can be visualised as follows:
 *
 *      +---------------+  +---------------+  +---------------+
 *      |   Process 0   |  |   Process 1   |  |   Process 2   |
 *      +---------------+  +---------------+  +---------------+
 *      |     Values    |  |     Values    |  |     Values    |
 *      +---+---+---+---+  +---+---+---+---+  +---+---+---+---+
 *      | 0 | 1 | 2 | 3 |  | 4 | 5 | 6 | 7 |  | 8 | 9 | 10| 11|
 *      +---+---+---+---+  +---+---+---+---+  +---+---+---+---+
 *        |    \   \   \     /  |     |  \      /   /   /   |
 *        | ____\___\___\___/___|_____|___\____/   /   /    |
 *        |/     \   \   \      |     |    \      /   /     |
 *        |       \___\___\____ | ____|_____\____/   /      |
 *        |            \   \   \|/    |      \      /       |
 *        |             \___\___|____ | ______\____/        |
 *        |                  \  |    \|/       \            |
 *        |                   \_|_____|_________\__________ |
 *        |                     |     |                    \|
 *        |                     |     |                     |
 *     +--+--+                +-+---+---+-+              +--+--+
 *     | SUM |                | SUM | SUM |              | SUM |
 *     +-----+                +-----+-----+              +-----+
 *     |  12 |                |  15 |  18 |              |  21 |
 *     +--+--+                +--+--+--+--+              +--+--+
 *        |                      |     |                    |      
 *  +-----+-----+             +--+-----+--+           +-----+-----+
 *  | Process 0 |             | Process 1 |           | Process 2 |
 *  +-----------+             +-----------+           +-----------+
 *  |   Value   |             |   Values  |           |   Value   |
 *  +-----------+             +-----+-----+           +-----------+
 *  |     12    |             |  15 |  18 |           |     21    |
 *  +-----------+             +-----+-----+           +-----------+
 **/
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    // Get the size of the communicator
    int size = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if(size != 3)
    {
        printf("This application is meant to be run with 3 MPI processes.\n");
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }

    // Get my rank
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Define my values
    int values[4] = {4 * my_rank, 4 * my_rank + 1, 4 * my_rank + 2, 4 * my_rank + 3};

    // Define the block lengths
    int counts[3] = {1, 2, 1};

    if(my_rank == 1)
    {
        // Issue the non-blocking reduce scatter; MPI process 1 receives counts[1] = 2 reduced values
        int reduction_results[2];
        MPI_Request request;
        MPI_Ireduce_scatter(values, reduction_results, counts, MPI_INT, MPI_SUM, MPI_COMM_WORLD, &request);

        // Do some job while it progresses
        // ...

        // Wait for the MPI_Ireduce_scatter to complete
        MPI_Wait(&request, MPI_STATUS_IGNORE);
        printf("[MPI process %d] The sums I received are %d and %d.\n", my_rank, reduction_results[0], reduction_results[1]);
    }
    else
    {
        // Issue the non-blocking reduce scatter; MPI processes 0 and 2 each receive a single reduced value
        int reduction_result;
        MPI_Request request;
        MPI_Ireduce_scatter(values, &reduction_result, counts, MPI_INT, MPI_SUM, MPI_COMM_WORLD, &request);

        // Do some job while it progresses
        // ...

        // Wait for the MPI_Ireduce_scatter to complete
        MPI_Wait(&request, MPI_STATUS_IGNORE);
        printf("[MPI process %d] The sum I received is %d.\n", my_rank, reduction_result);
    }

    MPI_Finalize();

    return EXIT_SUCCESS;
}