Comments (4)
I saw your question on the MPICH mailing list, but it was slightly different from what you asked on the OMPI user mailing list and from what you state in this issue. While MPICH does have some level of support for zero-copy over XPMEM, that is not true for structs (which is the exact type you asked about on the OMPI user mailing list), nor does it provide similar support when UCX is used as the underlying transport (for intra-node communications).
Thus, even if we add node-level support for zero-copy over XPMEM, it will be useless as soon as you enable UCX.
Thanks, I see. Is UCX itself unable to do non-contiguous zero-copy (either intra-node or inter-node), or is it not supported on the OMPI/MPICH side?
Not having zero-copy for non-contiguous sends means certain data structures need to be split into many requests.
For example, if you have this:
#include <vector>

struct S {
    std::vector<float>  vec_f; // size 10k
    std::vector<int>    vec_i; // size 10k
    std::vector<double> vec_d; // size 10k
    std::vector<char>   vec_c; // size 10k
};
std::vector<S> vec_s; // size 100
Being able to send it in 1 request instead of 400 (one for each contiguous buffer) seems like it could be quite advantageous for performance and ease of use, even if there are restrictions, e.g. that the buffer datatype must be MPI_BYTE (i.e. assuming the sender and receiver have the same architecture / a homogeneous cluster).
At the moment, sending such a data structure in 1 request with a struct datatype results in packing/unpacking, which is very slow for large buffers, so much so that it is significantly slower than sending 400 zero-copy requests as in the example above.
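As a rough illustration (not code from this issue), the one-request version could be expressed with a derived datatype built from absolute addresses, e.g. MPI_Type_create_hindexed over MPI_BYTE blocks, reusing struct S and vec_s from the example above; the helper name build_datatype is just a placeholder:

#include <mpi.h>
#include <vector>

// struct S and vec_s are as defined in the example above.
MPI_Datatype build_datatype(const std::vector<S>& vec_s)
{
    std::vector<int>      lengths; // block lengths in bytes
    std::vector<MPI_Aint> displs;  // absolute addresses of each contiguous buffer
    for (const S& s : vec_s) {
        MPI_Aint a;
        MPI_Get_address(s.vec_f.data(), &a); displs.push_back(a); lengths.push_back((int)(s.vec_f.size() * sizeof(float)));
        MPI_Get_address(s.vec_i.data(), &a); displs.push_back(a); lengths.push_back((int)(s.vec_i.size() * sizeof(int)));
        MPI_Get_address(s.vec_d.data(), &a); displs.push_back(a); lengths.push_back((int)(s.vec_d.size() * sizeof(double)));
        MPI_Get_address(s.vec_c.data(), &a); displs.push_back(a); lengths.push_back((int)(s.vec_c.size() * sizeof(char)));
    }
    MPI_Datatype dtype;
    MPI_Type_create_hindexed((int)displs.size(), lengths.data(), displs.data(), MPI_BYTE, &dtype);
    MPI_Type_commit(&dtype);
    return dtype;
}

// One request covers all 400 buffers; MPI_BOTTOM because the displacements are absolute:
//   MPI_Datatype dtype = build_datatype(vec_s);
//   MPI_Send(MPI_BOTTOM, 1, dtype, dest, 0, MPI_COMM_WORLD);
//   MPI_Type_free(&dtype);

As things stand, a request like this is the one that goes through packing/unpacking rather than zero-copy.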
You mentioned on the mailing list that doing it in 1 zero-copy request is a complex operation that would be expensive and add significant latency.
But it seems like it could at least be an improvement over 400 separate zero-copy requests?
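For comparison, the 400-request alternative would look roughly like the sketch below (the helper name send_all and the tagging scheme are made up, and the matching MPI_Irecv calls on the receiver are omitted):

#include <mpi.h>
#include <vector>

// Post one MPI_Isend per contiguous buffer: 4 per struct, 400 in total for the example above.
void send_all(const std::vector<S>& vec_s, int dest, MPI_Comm comm)
{
    std::vector<MPI_Request> reqs;
    reqs.reserve(vec_s.size() * 4);
    int tag = 0;
    for (const S& s : vec_s) {
        MPI_Request r;
        MPI_Isend(s.vec_f.data(), (int)s.vec_f.size(), MPI_FLOAT,  dest, tag++, comm, &r); reqs.push_back(r);
        MPI_Isend(s.vec_i.data(), (int)s.vec_i.size(), MPI_INT,    dest, tag++, comm, &r); reqs.push_back(r);
        MPI_Isend(s.vec_d.data(), (int)s.vec_d.size(), MPI_DOUBLE, dest, tag++, comm, &r); reqs.push_back(r);
        MPI_Isend(s.vec_c.data(), (int)s.vec_c.size(), MPI_CHAR,   dest, tag++, comm, &r); reqs.push_back(r);
    }
    MPI_Waitall((int)reqs.size(), reqs.data(), MPI_STATUSES_IGNORE);
}

Each of these sends is contiguous and can go zero-copy, but the request count grows with the number of buffers.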
Other use cases could be sending many sub-views of a large 2D array, or sending map-like/tree-like types.
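For the 2D case, a sub-view can already be described in one request with MPI_Type_create_subarray; a minimal sketch with made-up dimensions (the open question is whether such a request can avoid packing):

#include <mpi.h>
#include <vector>

// Send a 100x100 sub-block of a 1000x1000 row-major array in a single request.
void send_subblock(const std::vector<double>& big, int dest, MPI_Comm comm)
{
    int sizes[2]    = {1000, 1000}; // full array dimensions (illustrative)
    int subsizes[2] = {100, 100};   // sub-view to send
    int starts[2]   = {250, 500};   // upper-left corner of the sub-view

    MPI_Datatype sub;
    MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C, MPI_DOUBLE, &sub);
    MPI_Type_commit(&sub);

    // One request describes the whole strided region; without zero-copy support
    // for non-contiguous types it still gets packed internally.
    MPI_Send(big.data(), 1, sub, dest, 0, comm);
    MPI_Type_free(&sub);
}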
@pascal-boeschoten-hapteon The UCX API currently does not optimize for complex datatypes (for example, zero-copy over RDMA or XPMEM), even though this is something that was discussed and considered for addition.
@yosefe Thanks, that's good to know.
@bosilca To give a bit more context, we've observed that when sending many large and complex data structures (similar to the one in the example above) to many other ranks, it's significantly slower to have many small zero-copy requests vs one big request with packing/unpacking. The sheer volume of requests seems to be the bottleneck, and we've seen up to a factor of 5 difference in throughput. But when it's just 1 rank sending to 1 other rank, the many small zero-copy requests are faster, as the packing/unpacking becomes limited by memory bandwidth. This should mean that if we could have one big zero-copy request, the performance gain in the congested case would be very significant.
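For reference, the "one big request with packing/unpacking" path amounts to staging everything into one contiguous buffer before a single send, whether the MPI library does it internally for a derived datatype or the application does it by hand; a rough sketch of the hand-written version (the helper name pack_and_send is illustrative, and exchanging the per-buffer sizes needed for unpacking on the receiver is omitted):

#include <mpi.h>
#include <cstring>
#include <vector>

// Copy every contiguous buffer into one staging buffer, then issue a single send.
// The extra copy is why this path ends up limited by memory bandwidth.
void pack_and_send(const std::vector<S>& vec_s, int dest, MPI_Comm comm)
{
    size_t total = 0;
    for (const S& s : vec_s)
        total += s.vec_f.size() * sizeof(float) + s.vec_i.size() * sizeof(int)
               + s.vec_d.size() * sizeof(double) + s.vec_c.size() * sizeof(char);

    std::vector<char> staging(total);
    size_t off = 0;
    auto append = [&](const void* p, size_t n) { std::memcpy(staging.data() + off, p, n); off += n; };
    for (const S& s : vec_s) {
        append(s.vec_f.data(), s.vec_f.size() * sizeof(float));
        append(s.vec_i.data(), s.vec_i.size() * sizeof(int));
        append(s.vec_d.data(), s.vec_d.size() * sizeof(double));
        append(s.vec_c.data(), s.vec_c.size() * sizeof(char));
    }
    MPI_Send(staging.data(), (int)staging.size(), MPI_BYTE, dest, 0, comm);
}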