oshmpi's Introduction

OSHMPI: OpenSHMEM over MPI-3

This project is no longer actively developed.

Bug fixes and specific feature requests will be addressed as time permits. We will try to read and comment on issues and pull requests promptly.

We recommend you use Sandia OpenSHMEM to get the latest OpenSHMEM features. We tried to support OpenSHMEM 1.2 but have no plans to support OpenSHMEM 1.3 or later.

Test Status

We currently test with MPICH 3.2 on Ubuntu.

Publications

Jeff R. Hammond, Sayan Ghosh, and Barbara M. Chapman, "Implementing OpenSHMEM using MPI-3 one-sided communication."
Preprint: https://github.com/jeffhammond/oshmpi/blob/master/docs/iwosh-paper.pdf
Workshop Proceedings: http://www.csm.ornl.gov/workshops/openshmem2013/documents/ImplementingOpenSHMEM%20UsingMPI-3.pdf
Journal: http://dx.doi.org/10.1007/978-3-319-05215-1_4

Platform Support

OSHMPI uses essentially all the new RMA features in MPI-3 and thus requires an MPI implementation that supports these properly. Currently, MPICH 3.0.x and its derivatives support MPI-3. We assume support for the UNIFIED memory model of MPI-3; OSHMPI will abort if this is not provided by the implementation, which should only be the case for non-cache-coherent systems.

Because SHMEM allows for communication against remote global variables and not just the symmetric heap, OSHMPI has an operating system dependency since accessing the text and data segments is not portable.

The platforms we currently test on are:

  • Mac with GCC 4.8 or LLVM 3.3+ and MPICH master (see Bugs below)
  • Linux x86_64 with MPICH 3+ and MVAPICH2 1.9+

We have tested on these platforms at one point or another:

  • SGI ccNUMA with MPICH 3

Features

OSHMPI attempts to use MPI-3 as effectively as possible. To this end, we support all valid performance-related info keys and ensure the correct semantics when they are enabled and disabled.

When OSHMPI is used within an SMP, we employ shared-memory windows to bypass MPI in Put, Get and Atomic operations, so that these use only load-store instructions or GCC intrinsics. However, for strided operations, we still use MPI within an SMP because the lead developer is a lazy bum.
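The load-store path can be illustrated with a minimal sketch. This is not OSHMPI's actual code: it assumes each PE's symmetric heap is already mapped into the local address space (for example through a shared-memory window), and the names `smp_peer_t`, `smp_translate`, and `smp_put` are hypothetical. Under that assumption, a contiguous put is an address translation followed by `memcpy`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: where one PE's symmetric heap is mapped locally. */
typedef struct {
    char  *base;  /* local mapping of that PE's heap */
    size_t size;  /* heap size in bytes (assumed equal on every PE) */
} smp_peer_t;

/* Translate an address inside my heap to the same offset in the peer's heap. */
static void *smp_translate(const void *addr, size_t bytes,
                           const smp_peer_t *me, const smp_peer_t *peer)
{
    ptrdiff_t off = (const char *)addr - me->base;
    assert(off >= 0 && (size_t)off <= me->size);
    assert(bytes <= me->size - (size_t)off);  /* transfer stays inside the heap */
    return peer->base + off;
}

/* Contiguous put within an SMP: a plain store into the peer's mapping. */
static void smp_put(void *target, const void *source, size_t bytes,
                    const smp_peer_t *me, const smp_peer_t *peer)
{
    memcpy(smp_translate(target, bytes, me, peer), source, bytes);
}
```

A strided put would need a loop over elements around the same translation, which is the part the text above says still goes through MPI.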

Future Work

We look forward to patches contributing the following:

  • Allow selection of tuning options at runtime.
  • Eliminate all intranode MPI-RMA communication.
  • Cache subcommunicators corresponding to PE subgroups.

Bugs/Omissions

We look forward to patches addressing the following:

  • Mac non-heap symmetric data cannot be accessed remotely reliably.
  • PSHMEM interface
  • Fortran interface
  • OpenSHMEM 1.3 changes

oshmpi's People

Contributors

jeffhammond, sg0

oshmpi's Issues

shmem_pe_accessible needs to check for MPMD

The source TODO comment describes this issue succinctly.

/* 8.3: Accessibility Query Routines */
int shmem_pe_accessible(int pe) 
{ 
    /* TODO: detect MPMD launching, i.e. if PE is running same binary as me.
     *       MPI_APPNUM attribute of MPI_COMM_WORLD (MPI-3 10.5.3) is the way.
     *       Create a window containing these values so that any PE can Get it
     *       and compare against the local value. */
    /* Valid PE numbers run from 0 to shmem_world_size-1. */
    return ( 0<=pe && pe<shmem_world_size ); 
} 
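The comparison step described in the TODO can be sketched without MPI. In this hypothetical helper, `appnums[i]` stands in for the `MPI_APPNUM` value of PE `i`, which a real implementation would fetch from a window. The name `pe_accessible_sketch` and the assumption that equal `MPI_APPNUM` values mean "same binary" are illustrative, not taken from OSHMPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: appnums[i] holds the MPI_APPNUM value of PE i.
 * A PE is treated as accessible if it exists and was launched from the
 * same application entry as the caller (assumed to mean the same binary). */
static int pe_accessible_sketch(int pe, int my_pe, const int *appnums, int npes)
{
    assert(appnums != NULL && 0 <= my_pe && my_pe < npes);
    if (pe < 0 || pe >= npes)
        return 0;                           /* no such PE */
    return appnums[pe] == appnums[my_pe];   /* 1 if same application entry */
}
```

With `appnums = {0, 0, 1, 1}`, PE 0 would see PE 1 as accessible and PEs 2 and 3 as not accessible.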

Build option --enable-comm-caching breaks barrier microbenchmark

I have a barrier microbenchmark that scales the number of PEs from 2 to NPES. When OSHMPI is built with --enable-comm-caching, this benchmark always deadlocks at PE_Size=18 (after completing PE_Size={2..17} correctly).

I've reproduced this bug with Intel MPI and MPICH on two different systems including multi-node and SMP configurations.

This bug does not occur when OSHMPI is built without --enable-comm-caching. It also may not reproduce if NPES=18: with NPES=18 and PE_Size=18, an implementation may convert a shmem_barrier() to a shmem_barrier_all() and thereby avoid the bug. For those implementations, NPES has to be at least 19.

warn user when OpenSHMEM spec is violated

OSHMPI collectives do not require symmetric variables. This lets users run incorrect code that works with OSHMPI but fails with other implementations. OSHMPI should check the arguments and emit a warning message when OpenSHMEM semantics are violated.
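One possible form of such a check, as a sketch only: test whether an argument lies entirely inside a known symmetric region (the symmetric heap, or the data segment) and print a warning otherwise. The type `sym_region_t`, the function names, and the message text are hypothetical. How OSHMPI would obtain the region bounds is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical description of one symmetric region (heap or data segment). */
typedef struct {
    uintptr_t base;
    size_t    size;
} sym_region_t;

/* True if [addr, addr+bytes) lies entirely inside region r. */
static int in_region(const void *addr, size_t bytes, const sym_region_t *r)
{
    uintptr_t a = (uintptr_t)addr;
    return a >= r->base && bytes <= r->size && a - r->base <= r->size - bytes;
}

/* Returns 1 if the argument is symmetric; otherwise warns and returns 0. */
static int check_symmetric(const char *routine, const void *addr, size_t bytes,
                           const sym_region_t *regions, size_t nregions)
{
    assert(routine != NULL && regions != NULL);
    for (size_t i = 0; i < nregions; i++)
        if (in_region(addr, bytes, &regions[i]))
            return 1;
    fprintf(stderr, "warning: %s: argument %p is not a symmetric object\n",
            routine, addr);
    return 0;
}
```

A collective could call `check_symmetric` on its target, source, and pSync arguments before communicating, and continue after the warning so that existing (incorrect) programs keep running.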

wait_until violates the MPI standard

From Jim Dinan:

Initial report:

It looks like this implementation of wait_until violates the MPI spec with regard to polling on a location in memory. Am I missing something, or are you relying on the MPI implementation supporting this as an extension?

Elaboration:

I’m thinking of rule #2 on page 456:

  2. Updates from a remote process will appear in the memory of the target, but there are no atomicity or ordering guarantees if more than one byte is updated. Updates are stable in the sense that once data appears in memory of the target, the data remains until replaced by another update. This permits polling on a location for a change from zero to non-zero or for a particular value, but not polling and comparing the relative magnitude of values. Users are cautioned that polling on one memory location and then accessing a different memory location has defined behavior only if the other rules given here and in this chapter are followed.

This effectively allows you to poll on only a byte. If you are expecting bytes to all show up at once, then you’re relying on an extension. For instance, assume a user asks to wait until the value of an integer is 1. Let’s say the user puts 257 = 256 + 1 = 0x0101 and the least significant byte arrives first. The buffer will temporarily contain 1, even though the integer value should be 257.
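The 257 example can be worked through in a small model. This sketch does not use MPI. It only simulates a 32-bit value arriving one byte at a time, least significant byte first, and shows what a poller would observe at each step. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Value visible after `bytes_arrived` bytes of new_value (least significant
 * byte first) have overwritten the corresponding bytes of old_value. */
static uint32_t partial_update(uint32_t old_value, uint32_t new_value,
                               unsigned bytes_arrived)
{
    assert(bytes_arrived <= 4);
    uint32_t v = old_value;
    for (unsigned i = 0; i < bytes_arrived; i++) {
        uint32_t mask = (uint32_t)0xFF << (8 * i);
        v = (v & ~mask) | (new_value & mask);
    }
    return v;
}

/* What a polling "wait until equal to target" would conclude right now. */
static int wait_eq_satisfied(uint32_t visible, uint32_t target)
{
    return visible == target;
}
```

Starting from 0 and putting 257, the visible value after one byte is 1, so a wait for the value 1 would return even though 1 was never written as a whole integer. After the second byte the value is 257.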

include full license for third-party tests

Since we are not redistributing the entire OMB, we need to inline the full license in such source files. Otherwise, we should remove them from OSHMPI and just provide a pointer to the OSU tests of interest in test/README (for example).

/*
 * Copyright (C) 2002-2013 the Network-Based Computing Laboratory
 * (NBCL), The Ohio State University. 
 *
 * Contact: Dr. D. K. Panda ([email protected])
 *
 * For detailed copyright and licensing information, please refer to the
 * copyright file COPYRIGHT in the top level OMB directory.
 */

use GCC intrinsics for SMP-optimized atomics

We convert put and get to load-store within an SMP (when SMP optimizations are active), but we do not do the same for atomics. Using GCC intrinsics (or OPA) will improve the performance of SHMEM atomics.
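As a rough sketch of what this could look like, assuming the target address already points into a peer's memory mapped in this process, the GCC `__atomic` builtins cover fetch-and-add, swap, and compare-and-swap. The `smp_long_*` names are hypothetical and are not OSHMPI functions. Note also that CPU atomics are not guaranteed to be atomic with respect to MPI accumulate operations on the same location, so mixing the two paths would need care.

```c
#include <assert.h>
#include <stddef.h>

/* Fetch-and-add on memory shared within the node; returns the old value. */
static long smp_long_fadd(long *target, long value)
{
    assert(target != NULL);
    return __atomic_fetch_add(target, value, __ATOMIC_SEQ_CST);
}

/* Unconditional swap; returns the old value. */
static long smp_long_swap(long *target, long value)
{
    assert(target != NULL);
    return __atomic_exchange_n(target, value, __ATOMIC_SEQ_CST);
}

/* Compare-and-swap: store value only if *target == cond.
 * Returns the old value whether or not the store happened. */
static long smp_long_cswap(long *target, long cond, long value)
{
    long expected = cond;
    assert(target != NULL);
    /* On failure the builtin writes the current value into `expected`;
     * on success `expected` still equals the old value. */
    __atomic_compare_exchange_n(target, &expected, value, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

Sequentially consistent ordering is used here for simplicity. A tuned version might choose weaker orderings where the OpenSHMEM semantics allow it.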
