
MPI_Win_create

Definition

MPI_Win_create is the first step in using one-sided communications in MPI. It allows each process to expose a window in its memory that can be accessed one-sidedly by remote processes. The call returns an opaque object that represents the attributes of the window (as specified by the initialisation call) and the group of processes that own and access the set of windows. Unlike MPI_Win_allocate, which allocates the window memory automatically, MPI_Win_create requires the memory exposed through the window to be allocated beforehand. MPI_Win_create is a collective operation: it must be called by all MPI processes in the communicator concerned. A window created with MPI_Win_create must be freed with MPI_Win_free once all pending RMA communications on that window have completed. Other variants are MPI_Win_create_dynamic, MPI_Win_allocate and MPI_Win_allocate_shared.


int MPI_Win_create(void* base,
                   MPI_Aint size,
                   int displacement_unit,
                   MPI_Info info,
                   MPI_Comm communicator,
                   MPI_Win* window);

Parameters

base

The address of the start of the memory region exposed through the window.

size

The size of the memory area exposed through the window, in bytes.

displacement_unit

The displacement unit provides an indexing feature for RMA operations: the target displacement specified in an RMA call is multiplied by the displacement unit declared by the target. The displacement unit is expressed in bytes, so that it remains meaningful in a heterogeneous environment. A common choice is the size of the element type stored in the window (for example sizeof(int)), so that target displacements can be given in elements rather than bytes.

info

The info argument provides optimisation hints to the runtime about the expected usage pattern of the window.

  • no_locks: if set to true, then the implementation may assume that passive target synchronisation (that is, MPI_Win_lock, MPI_Win_lock_all) will not be used on the given window. This implies that this window is not used for 3-party communication, and RMA can be implemented with no (less) asynchronous agent activity at this process.
  • accumulate_ordering: controls the ordering of accumulate operations at the target. The default value is rar,raw,war,waw.
  • accumulate_ops: if set to same_op, the implementation will assume that all concurrent accumulate calls to the same target address will use the same operation. If set to same_op_no_op, then the implementation will assume that all concurrent accumulate calls to the same target address will use the same operation or MPI_NO_OP. This can eliminate the need to protect access for certain operation types where the hardware can guarantee atomicity. The default is same_op_no_op.
  • same_size: if set to true, then the implementation may assume that the argument size is identical on all processes, and that all processes have provided this info key with the same value.
  • same_disp_unit: if set to true, then the implementation may assume that the argument displacement_unit is identical on all processes, and that all processes have provided this info key with the same value.
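A hedged sketch of how these hints can be supplied in practice: an info object is created, the desired keys are set as strings, and the object is passed to MPI_Win_create. The key names are the standard ones listed above, but an implementation is free to ignore any of them; the buffer and communicator here are illustrative only.

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    // Build an info object carrying optimisation hints for the window
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "no_locks", "true");            // no passive target synchronisation
    MPI_Info_set(info, "accumulate_ordering", "none"); // relax accumulate ordering
    MPI_Info_set(info, "same_disp_unit", "true");      // same displacement unit everywhere

    // Create a window over a pre-allocated buffer, passing the hints
    int window_buffer[2];
    MPI_Win window;
    MPI_Win_create(window_buffer, 2 * sizeof(int), sizeof(int),
                   info, MPI_COMM_WORLD, &window);

    // The info object may be freed as soon as MPI_Win_create returns
    MPI_Info_free(&info);

    MPI_Win_free(&window);
    MPI_Finalize();
    return EXIT_SUCCESS;
}
```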
communicator

The communicator containing all MPI processes involved in the RMA communications. The various processes in the corresponding group may specify completely different target windows, in location, size, displacement unit and info arguments. As long as all the get, put and accumulate accesses to a particular process fit within that process's target window, this poses no problem. The same memory area may appear in multiple windows, each associated with a different window object; however, concurrent communications to distinct, overlapping windows may lead to undefined results.

window

A pointer to the variable in which to store the created window.

Return value

The error code returned from the window creation: MPI_SUCCESS if the window was created successfully, an MPI error code otherwise.

Example


#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/**
 * @brief Illustrate how to create a window.
 * @details This application consists of two MPI processes. MPI process 1
 * exposes a window containing 2 integers. The first one is initialised to 0 and
 * will be overwritten by MPI process 0 via MPI_Put to become 12345. The second
 * will be initialised to 67890 and will be read by MPI process 0 via MPI_Get.
 * After these two commands are issued, synchronisation takes place via
 * MPI_Win_fence and each MPI process prints the value that came from the other
 * peer.
 **/
int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    // Check that exactly 2 MPI processes were spawned
    int comm_size;
    MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
    if(comm_size != 2)
    {
        printf("This application is meant to be run with 2 MPI processes, not %d.\n", comm_size);
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }

    // Get my rank
    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Create the window
    const int ARRAY_SIZE = 2;
    int window_buffer[ARRAY_SIZE];
    MPI_Win window;
    if(my_rank == 1)
    {
        window_buffer[0] = 0;
        window_buffer[1] = 67890;
    }
    MPI_Win_create(window_buffer, ARRAY_SIZE * sizeof(int), sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &window);
    MPI_Win_fence(0, window);

    int remote_value;
    if(my_rank == 0)
    {
        // Fetch the second integer in MPI process 1's window
        MPI_Get(&remote_value, 1, MPI_INT, 1, 1, 1, MPI_INT, window);

        // Push my value into the first integer in MPI process 1's window
        int my_value = 12345;
        MPI_Put(&my_value, 1, MPI_INT, 1, 0, 1, MPI_INT, window);
    }

    // Wait for the MPI_Get and MPI_Put issued to complete before going any further
    MPI_Win_fence(0, window);

    if(my_rank == 0)
    {
        printf("[MPI process 0] Value fetched from MPI process 1 window_buffer[1]: %d.\n", remote_value);
    }
    else
    {
        printf("[MPI process 1] Value put in my window_buffer[0] by MPI process 0: %d.\n", window_buffer[0]);
    }

    // Destroy the window
    MPI_Win_free(&window);

    MPI_Finalize();

    return EXIT_SUCCESS;
}