MPI_Reduce_scatter_block

Definition

MPI_Reduce_scatter_block is the means by which MPI processes can apply a reduction followed by a scatter, which may trigger extra optimisations compared to manually issuing an MPI_Reduce followed by an MPI_Scatter. Unlike MPI_Reduce_scatter, MPI_Reduce_scatter_block restricts all blocks to be of the same length. MPI_Reduce_scatter_block is a collective operation; it must be called by every MPI process in the communicator given. Predefined operations are: MPI_MIN, MPI_MAX, MPI_BOR, MPI_BXOR, MPI_LOR, MPI_LXOR, MPI_BAND, MPI_LAND, MPI_SUM and MPI_PROD. Other variants of MPI_Reduce_scatter_block are MPI_Reduce_scatter, MPI_Ireduce_scatter_block, MPI_Ireduce_scatter. Refer to MPI_Ireduce_scatter_block to see the non-blocking counterpart of MPI_Reduce_scatter_block.
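For reference, the effect of MPI_Reduce_scatter_block can be reproduced by hand with an MPI_Reduce into a temporary buffer on a root followed by an MPI_Scatter; the fused routine simply gives the MPI implementation the opportunity to optimise that combination. Below is a minimal sketch of the manual equivalent, assuming MPI_INT elements with a sum reduction and using comm, count, send_buffer and receive_buffer as placeholder names rather than arguments defined on this page:

int size;
MPI_Comm_size(comm, &size);

int* temporary = (int*)malloc(count * size * sizeof(int));

// First reduce all the blocks onto a root...
MPI_Reduce(send_buffer, temporary, count * size, MPI_INT, MPI_SUM, 0, comm);
// ...then scatter one block of "count" reduced elements to each MPI process.
MPI_Scatter(temporary, count, MPI_INT, receive_buffer, count, MPI_INT, 0, comm);

free(temporary);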

int MPI_Reduce_scatter_block(const void* send_buffer,
                             void* receive_buffer,
                             int count,
                             MPI_Datatype datatype,
                             MPI_Op operation,
                             MPI_Comm communicator);

Parameters

send_buffer
A pointer to the buffer containing the elements to send for reduction.
receive_buffer
A pointer to the buffer in which to store the block of reduced elements received by this MPI process.
count
The number of elements that each MPI process receives. The send buffer must therefore contain count * size elements, where size is the number of MPI processes in the communicator.
datatype
The type of a buffer element.
operation
The operation to apply to combine the elements received in the reduction. This operation must be associative; predefined operations are also commutative, whereas user-defined operations may be non-commutative.
communicator
The communicator in which the reduction takes place.
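As noted above, the send buffer must be able to hold one block of count elements per MPI process in the communicator. Below is a minimal sizing sketch, assuming MPI_INT elements with a sum reduction and reusing the parameter names above as plain placeholder variables; it is an illustration, not a complete program:

int size;
MPI_Comm_size(communicator, &size);

// One block of "count" elements is contributed for every MPI process...
int* send_buffer = (int*)malloc(count * size * sizeof(int));
// ...and each MPI process receives a single reduced block of "count" elements.
int* receive_buffer = (int*)malloc(count * sizeof(int));

// (Fill send_buffer here before issuing the call.)
MPI_Reduce_scatter_block(send_buffer, receive_buffer, count,
                         MPI_INT, MPI_SUM, communicator);

free(send_buffer);
free(receive_buffer);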

Returned value

The error code returned from the reduction.

MPI_SUCCESS
The routine successfully completed.
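The returned code is only observable if errors are not fatal; by default, MPI aborts on error. Below is a minimal sketch of checking the returned value inside an MPI program, assuming the error handler of the communicator has first been switched to MPI_ERRORS_RETURN and reusing send_buffer, receive_buffer and count as placeholder names:

MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

int error = MPI_Reduce_scatter_block(send_buffer, receive_buffer, count,
                                     MPI_INT, MPI_SUM, MPI_COMM_WORLD);
if(error != MPI_SUCCESS)
{
    // Translate the error code into a human-readable message.
    char message[MPI_MAX_ERROR_STRING];
    int length;
    MPI_Error_string(error, message, &length);
    printf("MPI_Reduce_scatter_block failed: %s\n", message);
}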

Example

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/**
 * @brief Illustrates how to use a reduce scatter block.
 * @details This application is meant to be run with 3 MPI processes. It
 * consists of a sum reduction; every MPI process has three values to send for
 * reduction. The first value of each MPI process will be summed and stored on
 * MPI process 0. The second values will be summed separately and stored on
 * MPI process 1, and similarly the third values on MPI process 2. It can be
 * visualised as follows:
 *
 *      +-----------+  +-----------+  +-----------+
 *      | Process 0 |  | Process 1 |  | Process 2 |
 *      +-----------+  +-----------+  +-----------+
 *      |   Values  |  |   Values  |  |   Values  |
 *      +---+---+---+  +---+---+---+  +---+---+---+ 
 *      | 0 | 1 | 2 |  | 3 | 4 | 5 |  | 6 | 7 | 8 |
 *      +---+---+---+  +---+---+---+  +---+---+---+
 *        |    \   \     /   |    \    /   /    |
 *        | ____\___\___/____|_____\__/   /     |
 *        |/     \   \       |      \    /      |
 *        |       \___\_____ | ______\__/       |
 *        |            \    \|/       \         |
 *        |             \____|_________\_______ |
 *        |                  |                 \|
 *        |                  |                  |
 *     +--+--+            +--+--+            +-----+
 *     | SUM |            | SUM |            | SUM |
 *     +-----+            +-----+            +-----+
 *     |  9  |            |  12 |            |  15 |
 *     +--+--+            +--+--+            +--+--+
 *        |                  |                  |      
 *  +-----+-----+      +-----+-----+      +-----+-----+
 *  | Process 0 |      | Process 1 |      | Process 2 |
 *  +-----------+      +-----------+      +-----------+
 *  |   Value   |      |   Value   |      |   Value   |
 *  |     9     |      |     12    |      |     15    |
 *  +-----------+      +-----------+      +-----------+
 **/
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    // Get the size of the communicator
    int size = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if(size != 3)
    {
        printf("This application is meant to be run with 3 MPI processes.\n");
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }

    // Get my rank
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Define my values
    int values[3] = {3 * my_rank, 3 * my_rank + 1, 3 * my_rank + 2};

    // Each MPI process sends its values and provides a buffer in which to receive its reduction result
    int reduction_result = 0;
    MPI_Reduce_scatter_block(values, &reduction_result, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("[MPI process %d] The sum I received is %d.\n", my_rank, reduction_result);

    MPI_Finalize();

    return EXIT_SUCCESS;
}