In the current MPICH ABI, optional unsupported datatypes are set to ...

ABI: How to handle optional datatypes? (mpich issue, open)

dalcinl commented on June 2, 2024

ABI: How to handle optional datatypes?

from mpich.

Comments (7)

hzhou commented on June 2, 2024

Both MPI_Type_size and MPI_Type_get_envelope returning MPI_UNDEFINED sounds fine. We can support both.

dalcinl commented on June 2, 2024

> Both MPI_Type_size and ...

Perhaps MPI_Type_size failing outright is better?

If we allow MPI_Type_size to succeed and return MPI_UNDEFINED, why not do the same with MPI_Type_get_[true_]extent? And then, there are other query routines that could be allowed to succeed and return some special output.

Making everything but MPI_Type_get_envelope fail helps ensure that errors do not pass silently, e.g. calls to MPI_Type_size whose result is then unknowingly used as a negative size, leading to potentially disastrous bugs that can slip into production. You can argue that MPI_Type_get_envelope can also be used carelessly, but IMHO this routine is used far less, and the whole datatype decoding business is so cumbersome that those who use it usually know MPI better and are more aware of the gory details.
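
The silent-failure concern can be made concrete with a small sketch. It does not call MPI: `SKETCH_UNDEFINED` is a stand-in for `MPI_UNDEFINED` (a negative sentinel is assumed, here -32766), and both helper names are hypothetical. It only shows what happens when a sentinel "size" flows into byte-count arithmetic unchecked, and what a guard might look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for MPI_UNDEFINED so the sketch builds without MPI.
 * The real constant is implementation-defined; a negative value is assumed. */
#define SKETCH_UNDEFINED (-32766)

/* Unchecked: converts to size_t without looking at the sign, so a negative
 * sentinel silently becomes an enormous byte count. */
size_t nbytes_unchecked(int count, int type_size)
{
    return (size_t)count * (size_t)type_size;
}

/* Checked: returns 0 and stores the byte count on success; returns -1 if
 * either argument is negative (sentinel included) or the product overflows. */
int nbytes_checked(int count, int type_size, size_t *out)
{
    if (count < 0 || type_size < 0)
        return -1;
    if (type_size != 0 && (size_t)count > SIZE_MAX / (size_t)type_size)
        return -1;
    *out = (size_t)count * (size_t)type_size;
    return 0;
}
```

With a sentinel size, `nbytes_unchecked(4, SKETCH_UNDEFINED)` yields a huge value that could end up in a `malloc` or `memcpy` call, whereas `nbytes_checked` reports the problem. A query routine that fails outright makes the guard unnecessary.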

jeffhammond commented on June 2, 2024

We can think of this another way. For example, MPI_REAL16 is defined to be a 16-byte type. There is no problem with using it in MPI to move 16-byte slabs of memory, even if the Fortran compiler does not support REAL*16 or REAL(kind=REAL128). The problem arises when a reduction is performed because, per the following, this is the one place where the representation matters.

Therefore, it might be reasonable to defer errors until reductions are used, because that's the only place where there is a real problem. I recognize this is problematic for users to detect support for such types, hence it may be necessary to add a utility function to query for optional type support.

I'll note that today we already have a problem in this area because all the MPI implementations are doing arithmetic in C, which means that if a Fortran compiler ever defines arithmetic in a different manner than C, MPI implementations will produce incorrect results.

Rationale. Particularly for the longer floating-point types, C and Fortran may use different representations. For example, a Fortran compiler may define a 16-byte REAL type with 33 decimal digits of precision while a C compiler may define a 16-byte long double type that implements an 80-bit (10 byte) extended precision floating point value. Both of these types are 16 bytes long, but they are not interoperable. Thus, these types are defined by Fortran, even though C may define types of the same length. (End of rationale.)
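
The point of the rationale, that equal size does not imply equal representation, can be checked from plain C. The sketch below uses only `<float.h>`; the helper names are hypothetical, and it assumes `FLT_RADIX` is 2 so that `LDBL_MANT_DIG` counts significand bits.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Storage size of the C long double type in bytes, padding included. */
size_t long_double_bytes(void)
{
    return sizeof(long double);
}

/* Significand digits long double carries (bits, assuming FLT_RADIX == 2):
 * 113 for IEEE binary128, 64 for x87 extended precision,
 * 53 when long double is the same format as double. */
int long_double_mantissa_bits(void)
{
    return LDBL_MANT_DIG;
}

/* Nonzero only when long double has both the size and the precision of
 * IEEE binary128; being 16 bytes wide is not sufficient on its own. */
int long_double_matches_binary128(void)
{
    return sizeof(long double) == 16 && LDBL_MANT_DIG == 113;
}
```

On x86-64 Linux with GCC, these typically report 16 bytes but only 64 significand bits, which is the mismatch the rationale describes.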

dalcinl commented on June 2, 2024

> We can think of this another way. For example, MPI_REAL16 is defined to be a 16-byte type.

What about MPI_INTEGER, MPI_REAL, etc.? These types may be optional/unsupported simply because the MPI implementation does not support Fortran, or was configured without Fortran support. In this case, would you assume the sizes to be the usual ones you get from most Fortran compilers without special flags?

> hence it may be necessary to add a utility function to query for optional type support.

I definitely agree we still need something to flag optional types. However, rather than adding a new API, my proposal of "reusing" MPI_Type_get_envelope and combiner=MPI_UNDEFINED is an equivalent alternative that can be implemented right now on top of the current MPI 4.1 standard. You may find the semantics contrived and not like it. I'm just trying to prevent API explosion.

jeffhammond commented on June 2, 2024

You're right, although the proposed MPI_ABI_DETAILS seems like a reasonable way to determine the Fortran ABI, no?

Because the size of the default integer and real types isn't specified in the ABI, those can't be used at all when they are unsupported.

dalcinl commented on June 2, 2024

> You're right, although the proposed MPI_ABI_DETAILS seems like a reasonable way to determine the Fortran ABI, no?

No idea what you are talking about 😥.

> Because the size of the default integer and real types isn't specified in the ABI, those can't be used at all when they are unsupported.

Then back to my original point: even if "we can think of this another way" and defer errors as you said, in the end we would still need a runtime mechanism to flag optional datatypes.

hzhou commented on June 2, 2024

> Therefore, it might be reasonable to defer errors until reductions are used, because that's the only place where there is a real problem. I recognize this is problematic for users to detect support for such types, hence it may be necessary to add a utility function to query for optional type support.

I like this. Now I recall this was my thinking several years ago :) . We can always define a type size and always define the datatype. The type size may be inaccurate if the corresponding language/compiler, e.g. Fortran, does not support the type, but that shouldn't cause user issues since users shouldn't be using it from the unsupported languages anyway. The reduction should fail for such datatypes when the implementation can't do it -- this is the current behavior anyway.

This removes the question of datatype availability altogether. If it compiles, it is available. Whether reduction is supported is a separate question.
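
The split between "always has a size" and "may or may not support reduction" can be modeled in a few lines. This is only a toy model, not MPICH code: `sketch_type`, `sketch_copy`, `sketch_reduce_sum`, and `sum_double` are all hypothetical names, and a NULL hook stands in for "the implementation can't do arithmetic on this type".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: every type has a size; the arithmetic hook
 * is NULL when the implementation cannot reduce this type. */
typedef struct {
    size_t size;
    void (*sum)(void *inout, const void *in);
} sketch_type;

/* Data movement needs only the size, never the representation. */
void sketch_copy(void *dst, const void *src, size_t count, const sketch_type *t)
{
    memcpy(dst, src, count * t->size);
}

/* Reduction needs the representation: return -1 when no hook is available,
 * otherwise apply the hook element by element and return 0. */
int sketch_reduce_sum(void *inout, const void *in, size_t count, const sketch_type *t)
{
    if (t->sum == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        t->sum((char *)inout + i * t->size, (const char *)in + i * t->size);
    return 0;
}

/* Example hook for a type the "implementation" does understand. */
void sum_double(void *inout, const void *in)
{
    double a, b;
    memcpy(&a, inout, sizeof a);
    memcpy(&b, in, sizeof b);
    a += b;
    memcpy(inout, &a, sizeof a);
}
```

A descriptor such as `{ 16, NULL }` can still be copied in 16-byte slabs, while `sketch_reduce_sum` on it returns -1. That mirrors the idea that availability and reduction support are separate questions.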
