microsoft / Microsoft-MPI
Microsoft MPI
License: MIT License
Hello,
When I link to MSMPI (I am using the MSMPI SDK installed from the installer) using the ifort compiler in Visual Studio, I receive the following warnings:
LINK : warning LNK4098: defaultlib 'LIBCMT' conflicts with use of other libs; use /NODEFAULTLIB:library
1>msmpifec.lib(mpifbind.obj) : warning LNK4099: PDB 'msmpifec.pdb' was not found with 'msmpifec.lib(mpifbind.obj)' or at 'XXXX'; linking object as if no debug info
Both of these warnings appear to be related to linking to msmpifec.lib (running dumpbin /directives msmpifec.lib shows the linker directive /DEFAULTLIB:LIBCMT and some other dumpbin operations indicate that the .pdb file is expected at X:\bt\1070745\repo\src\mpi\msmpi\fortran\msmpifec\obj\amd64\msmpifec.pdb which does not come with the SDK).
What is the recommended approach to avoid these warnings? (Do I need to build this repo myself to produce the correct .pdb file and to set the runtime library to multithreaded DLL?)
Thanks.
There are important files that Microsoft projects should all have that are not present in this repository. A pull request has been opened to add the missing file(s). When the PR is merged, this issue will be closed automatically.
Microsoft teams can learn more about this effort and share feedback within the open source guidance available internally.
Not sure what the proper channel is to report this problem. Because there is no contact info on the website, I'm opening an issue here.
The download links to MS-MPI 10.1.1 and 10.1.2 on the official Release Notes are invalid. It now says "We're sorry, this download is no longer available." Please fix it, thank you.
P.S. The link to 10.0 is still alive.
$ gfortran -c -I$MINGW_PREFIX/include "$MSMPI_INC/mpi.f90"
C:\Program Files (x86)\Microsoft SDKs\MPI\Include\/mpi.f90:404:32:
404 | PARAMETER (MPI_FLOAT_INT=z'8c000000')
| 1
Error: Arithmetic overflow converting INTEGER(16) to INTEGER(4) at (1). This check can be disabled with the option '-fno-range-check'
C:\Program Files (x86)\Microsoft SDKs\MPI\Include\/mpi.f90:406:33:
406 | PARAMETER (MPI_DOUBLE_INT=z'8c000001')
| 1
Error: Arithmetic overflow converting INTEGER(16) to INTEGER(4) at (1). This check can be disabled with the option '-fno-range-check'
C:\Program Files (x86)\Microsoft SDKs\MPI\Include\/mpi.f90:408:31:
408 | PARAMETER (MPI_LONG_INT=z'8c000002')
| 1
Error: Arithmetic overflow converting INTEGER(16) to INTEGER(4) at (1). This check can be disabled with the option '-fno-range-check'
C:\Program Files (x86)\Microsoft SDKs\MPI\Include\/mpi.f90:410:32:
410 | PARAMETER (MPI_SHORT_INT=z'8c000003')
| 1
Error: Arithmetic overflow converting INTEGER(16) to INTEGER(4) at (1). This check can be disabled with the option '-fno-range-check'
C:\Program Files (x86)\Microsoft SDKs\MPI\Include\/mpi.f90:412:38:
412 | PARAMETER (MPI_LONG_DOUBLE_INT=z'8c000004')
| 1
Error: Arithmetic overflow converting INTEGER(16) to INTEGER(4) at (1). This check can be disabled with the option '-fno-range-check'
I get unexpected results with fairly complex usage of MPI_File_set_view. With some experimenting I managed to minimize the test case to this:
#include <mpi.h>
#include <cstdio>
inline void check_error(int err)
{
if(!err)return; int len;
char buf[MPI_MAX_ERROR_STRING];
MPI_Error_string(err, buf, &len);
std::printf("MPI error: %.*s\n", len, buf);
}
int main(int n, char **arg)
{
check_error(MPI_Init(&n, &arg));
int current_node, node_count;
check_error(MPI_Comm_rank(MPI_COMM_WORLD, &current_node));
check_error(MPI_Comm_size(MPI_COMM_WORLD, &node_count));
MPI_Datatype type;
int sizes[] = {1, 1, 1};
MPI_Aint offsets[] = {0, node_count + current_node, current_node};
MPI_Datatype types[] = {MPI_LB, MPI_BYTE, MPI_BYTE};
check_error(MPI_Type_create_struct(3, sizes, offsets, types, &type));
check_error(MPI_Type_commit(&type));
MPI_File file;
check_error(MPI_File_open(MPI_COMM_WORLD, "test.txt",
MPI_MODE_WRONLY | MPI_MODE_CREATE, MPI_INFO_NULL, &file));
check_error(MPI_File_set_size(file, 0));
MPI_Status status; int pos = 4;
if(!current_node)check_error(MPI_File_write(file, "::::", pos, MPI_BYTE, &status));
check_error(MPI_File_set_view(file, pos, MPI_BYTE, type, "native", MPI_INFO_NULL));
char buf[2];
buf[0] = 'a' + current_node; buf[1] = 'A' + current_node;
check_error(MPI_File_write(file, buf, 2, MPI_BYTE, &status));
check_error(MPI_File_close(&file));
check_error(MPI_Type_free(&type));
check_error(MPI_Finalize());
}
I expect that after running on 4 nodes, test.txt would contain ::::ABCDabcd, and with another MPI implementation that is indeed the case. But under MS-MPI I get:
MPI error: Other I/O error, error stack:
Other I/O error Invalid access to memory location.
And with 1 node it even hangs. When I use direct element order instead of reverse (swapping the last two elements of the offsets array) I get the slightly better result abcdABCD, but the first characters get wiped out.
When an error code is generated with MPIR_Err_create_code_valist, the Visual Studio debugger will stop. This behavior is quite annoying for us because we are trying to open a file with MPI-IO, and it is a legitimate scenario that this fails because we don't have the user rights to open the file.
We only use msmpi by linking against the release DLL, so we consider this behavior to be unexpected.
This is the line of code:
Microsoft-MPI/src/mpi/common/errutil.cpp
Line 343 in 7ff6bdc
We suggest wrapping it in #if DBG [...] #endif to at least have the debugger stop only when msmpi was built in debug mode.

Does Microsoft MPI support multi-rail on InfiniBand?
On Linux, you can use OpenMPI + UCX.
On a dual Mellanox ConnectX-6 setup, I get the following bandwidth benchmark results:
Thanks!
I configured my VS Code following this blog, and I changed the Code Runner setting.json like:
"cpp": "cd $dir && g++ $fileName -o $fileNameWithoutExt -fopenmp -l msmpi -L \"C:\\Program Files (x86)\\Microsoft SDKs\\MPI\\Lib\\x64\" -I \"C:\\Program Files (x86)\\Microsoft SDKs\\MPI\\Include\" && mpiexec -n 4 $dir$fileNameWithoutExt",
Is that wrong? I tried to include mpi.h, but something went wrong, like:
C:\Users\PC\AppData\Local\Temp\ccOw0hGm.o:main.cpp:(.text+0x4c): undefined reference to `MPI_Init@8'
C:\Users\PC\AppData\Local\Temp\ccOw0hGm.o:main.cpp:(.text+0x62): undefined reference to `MPI_Comm_rank@8'
C:\Users\PC\AppData\Local\Temp\ccOw0hGm.o:main.cpp:(.text+0x78): undefined reference to `MPI_Comm_size@8'
C:\Users\PC\AppData\Local\Temp\ccOw0hGm.o:main.cpp:(.text+0x90): undefined reference to `MPI_Get_processor_name@8'
C:\Users\PC\AppData\Local\Temp\ccOw0hGm.o:main.cpp:(.text+0x115): undefined reference to `MPI_Finalize@0'
collect2.exe: error: ld returned 1 exit status
and the line #include <mpi.h> in main.cpp also showed the error:
#include errors detected. Please update your includePath. Squiggles are disabled for this translation unit.
cannot open source file "mpi.h"
Hmm... maybe this package can only run in Visual Studio 2017?? I tried to include it in CLion (CMake) first but failed too.
I just don't want to install VS due to its huge size.
The official installer has version 10.1.2, but the GitHub repo here does not have a 10.1.2 tag.
Hi, thanks for the open source project. When I try to compile from source code using Visual Studio 2017, the solution always complains that it could not find the header:
#include "MpiTraceEvents.h" (mpitrace.h)
Would you please provide the way to generate this header file?
Older versions of MSVC (_MSC_VER < 1600) do not have <stdint.h>, which causes compilation errors since it is included by mpi.h.
Is there a resolution for the crash due to the memory build-up issue as quoted on this page?
For completed blocking send/recv, is it possible to recycle/free their buffer memories?
I am trying to establish a cluster of multiple Windows machines connected over a LAN using MS-MPI. I have installed the same version of MS-MPI on both machines, use the same username and password, and ran smpd -p 8676 on both.
I made a simple program to establish communication between the two machines using MPI_Comm_spawn. I used a command line like mpiexec -host -p 8676 sample.exe. The issue is that the parent and child are always created on the same machine (either local or remote). How can I make the parent run on the local machine and the child on the remote machine?
Can you please guide me on how to establish communication between both machines?
Azure Pipelines should be used for continuous integration of MS MPI.
Hi,
I installed MS-MPI via pacman (pacman -S mingw-w64-x86_64-msmpi), but the mpiexec command is missing; the MSYS MinGW64 shell says "command not found", though mpicc/mpic++ are there.
At the moment, vcpkg downloads prebuilt libraries for msmpi. It is really one of the very few ports doing this, because the fundamental idea is to build everything on the developer machine.
Now that the project is open source, do you think it would ever be possible to ask for an "official" port, made by the msmpi developers themselves, in order to use the best known build configuration?
$ gfortran -o example1 example.F90 -fno-range-check -lmsmpifec -lmsmpi -L"$MSMPI_LIB64" -I$MINGW_PREFIX/include
Warning: corrupt .drectve at end of def file
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\/msmpifec.lib(E:/bt/973263/repo/src/mpi/msmpi/fortran/msmpifec/obj/amd64/mpifbind.obj):(.text$mn+0x35): undefined reference to `__guard_check_icall_fptr'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\/msmpifec.lib(E:/bt/973263/repo/src/mpi/msmpi/fortran/msmpifec/obj/amd64/mpifbind.obj):(.text$mn+0x36): undefined reference to `__guard_check_icall_fptr'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\/msmpifec.lib(E:/bt/973263/repo/src/mpi/msmpi/fortran/msmpifec/obj/amd64/mpifbind.obj):(.text$mn+0x21): undefined reference to `__guard_check_icall_fptr'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\/msmpifec.lib(E:/bt/973263/repo/src/mpi/msmpi/fortran/msmpifec/obj/amd64/mpifbind.obj):(.text$mn+0x21): undefined reference to `__guard_check_icall_fptr'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\/msmpifec.lib(E:/bt/973263/repo/src/mpi/msmpi/fortran/msmpifec/obj/amd64/mpifbind.obj):(.text$mn+0x17): undefined reference to `__guard_check_icall_fptr'
C:/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: C:\Program Files (x86)\Microsoft SDKs\MPI\Lib\x64\/msmpifec.lib(E:/bt/973263/repo/src/mpi/msmpi/fortran/msmpifec/obj/amd64/mpifbind.obj):(.text$mn+0x17): more undefined references to `__guard_check_icall_fptr' follow
collect2.exe: error: ld returned 1 exit status
It looks like this is caused by control flow guard being enabled since 10.0.
To reproduce, inside an MSYS MinGW 64-bit shell:
mkdir /tmp/msmpi && cd /tmp/msmpi
cp "$MSMPI_INC/mpi.h" $MINGW_PREFIX/include
cp "$MSMPI_INC/mpif.h" $MINGW_PREFIX/include
cp "$MSMPI_INC/x64/mpifptr.h" $MINGW_PREFIX/include
gfortran -c -fno-range-check -I$MINGW_PREFIX/include "$MSMPI_INC/mpi.f90"
cp mpi.mod $MINGW_PREFIX/include
mkdir /tmp/mpi_test && cd /tmp/mpi_test
wget https://raw.githubusercontent.com/coderefinery/autocmake/c5057f8aee65/test/fc_mpi/src/example.F90
gfortran -o example1 example.F90 -fno-range-check -lmsmpifec -lmsmpi -L"$MSMPI_LIB64" -I$MINGW_PREFIX/include
First of all, thanks a lot for making msmpi open source. I think this will drive the project much farther in adoption, and faster than ever, together with the renewed developer friendliness Microsoft is so strongly showing in many projects: starting from the huge work on MSVC, continuing with the vcpkg project, and finally making the MS toolchain usable also in scientific HPC computing!
I'd really like to use the new 10.0 version in a project I am supervising, ALaDyn, which is a particle-in-cell code mainly written in Fortran and which uses MPI to scale to thousands of cores. Currently the Windows codebase is only used for debugging, never in production, but things may change really soon and we would like to be ready for that moment.
Setting apart cygwin, which I'd not consider a native windows build, we support building the code on Windows using MSMPI and the PGI 18.10 Fortran Compiler (community edition is enough), together with Visual Studio 2017 and vcpkg. We never tested Intel Compiler on windows, nor IntelMPI, for now (we may have to do it even faster than predicted, if we will be stuck on any problem with the current toolchain).
The problem I am facing is that it is impossible to build the MPI part using this "new" code.
In fact, the PGI compiler bundles an older version of MSMPI (HPC 2012, v4.1), which works perfectly and which is the only MPI distribution we support on Windows. If I use the freshly released 10.0 version (or even the slightly older 9.1 version), the compiler throws internal errors and refuses to continue.
May I ask why? For now, we couldn't find any possible solution. Shall I ask PGI? But since the compiler (18.10 is their latest version, fully compatible with VS2017) works with an older version of MSMPI, do you consider this a regression on your side? Which Fortran compilers are actually supported by MSMPI?
Thanks a lot! I am also eager to help this project in any way that would be possible for me!
(Originally posted on Windows HPC MPI Forum)
Very old versions of mpich (1.x) used the z'8c000000' notation to represent integers. This has since been changed so that regular decimal numbers are used, in this case -1946157056. MS MPI still uses the old way.
Compiling an application with gfortran and the MS MPI mpif.h header leads to this error:
C:\Program Files (x86)\Microsoft SDKs\MPI\Include/mpif.h:399:32:
PARAMETER (MPI_FLOAT_INT=z'8c000000')
1
Error: Arithmetic overflow converting INTEGER(16) to INTEGER(4) at (1).
This check can be disabled with the option '-fno-range-check'
The issue here is that z'8c000000' initially represents 2348810240, which is bigger than a signed 32-bit integer can hold. By adding the -fno-range-check option this value overflows into the negative space and ends up at the right number. However, this is a very bad approach, since it means that you have to enable this flag for all source files that include the mpif.h header, which essentially removes valuable error messages.
Can MS MPI be updated to not use this non-conformant behaviour so that applications built with gfortran and MS MPI don't have to use the mentioned flag?
My Visual Studio 2022 Community is giving me an error when I run my application saying msmpi.dll is not found. I followed the instructions here and installed the latest version of Microsoft MPI v10.1.2 from Microsoft. The .lib files are installed, but there is no msmpi.dll. Where can I get msmpi.dll?
Installation folder of msmpisdk.msi:
PS C:\Program Files (x86)\Microsoft SDKs\MPI> tree /f
Folder PATH listing for volume Windows
Volume serial number is 5C57-9F08
C:.
├───Include
│ │ mpi.f90
│ │ mpi.h
│ │ mpif.h
│ │ mpio.h
│ │ mspms.h
│ │ pmidbg.h
│ │
│ ├───x64
│ │ mpifptr.h
│ │
│ └───x86
│ mpifptr.h
│
├───Lib
│ ├───x64
│ │ msmpi.lib
│ │ msmpifec.lib
│ │ msmpifmc.lib
│ │
│ └───x86
│ msmpi.lib
│ msmpifec.lib
│ msmpifes.lib
│ msmpifmc.lib
│ msmpifms.lib
│
└───License
MicrosoftMPI-SDK-EULA.rtf
MPI-SDK-TPN.txt
The installation folder of msmpisetup.exe:
PS C:\Program Files\Microsoft MPI> tree /f
Folder PATH listing for volume Windows
Volume serial number is 5C57-9F08
C:.
├───Benchmarks
│ IMB-MPI1.exe
│ IMB-NBC.exe
│ MpiPingpong.exe
│
├───Bin
│ mpiexec.exe
│ mpitrace.man
│ msmpilaunchsvc.exe
│ smpd.exe
│
├───License
│ MicrosoftMPI-Redistributable-EULA.rtf
│ MPI-Redistributables-TPN.txt
│
└───Redist
MSMpiSetup.exe
The following snippet, previously discussed on SO and validated with MPICH, OpenMPI, and Intel MPI, is not working properly with MS-MPI 10.0.12498.5.
In a thread, a loop continuously listens for requests with consecutive calls to MPI_Irecv and MPI_Wait. To exit the loop cleanly, MPI_Cancel is called from another thread to cancel the request. According to the standard:
If a communication is marked for cancellation, then an MPI_WAIT call for that communication is guaranteed to return.
#include <mpi.h>
#include <iostream>
#include <future>
using namespace std::literals::chrono_literals;
void async_cancel(MPI_Request *request)
{
std::this_thread::sleep_for(1s);
std::cout << "Before MPI_Cancel" << std::endl;
int res = MPI_Cancel(request);
if (res != MPI_SUCCESS)
std::cerr << "MPI_Cancel failed" << std::endl;
std::cout << "After MPI_Cancel" << std::endl;
}
int main(int argc, char* argv[])
{
int provided;
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided != MPI_THREAD_MULTIPLE)
std::cout << "MPI_Init_thread could not provide MPI_THREAD_MULTIPLE" << std::endl;
int rank, numprocs;
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Request request;
MPI_Status status;
int buffer;
if (rank == 0)
{
MPI_Irecv(&buffer, 1, MPI_INT, 1, 123, MPI_COMM_WORLD, &request);
auto res = std::async(std::launch::async, &async_cancel, &request);
std::cout << "Before MPI_Wait" << std::endl;
MPI_Wait(&request, &status);
std::cout << "After MPI_Wait " << std::endl;
}
else
std::this_thread::sleep_for(2s);
MPI_Finalize();
return 0;
}
The expected result is:
Before MPI_Wait
Before MPI_Cancel
After MPI_Cancel
After MPI_Wait
With MS-MPI, MPI_Wait() does not return.
Hello! I found a bug using MS-MPI 10.1. MPI_Win_allocate_shared fails whenever size==0. This is explicitly supported according to the documentation:
The size argument may be different at each process and size = 0 is valid. https://docs.microsoft.com/en-us/message-passing-interface/mpi-win-allocate-shared-function
Allocating shm window: size=1 stride=1...OK.
Allocating shm window: size=0 stride=1...
job aborted:
[ranks] message
[0] fatal error
Fatal error in MPI_Win_allocate_shared: Other MPI error, error stack:
MPI_Win_allocate_shared(size=-1015819392, disp_unit=0, info=0x1, comm=0x1c000000, baseptr=0x00007FF844000000, win=0x00000097526FFD00) failed
CreateFileMapping failed, error 87
I think the issue is that under the hood, MS-MPI calls MPID_Win_create_[non]contig, which in turn calls CreateFileMappingW, whose documented behavior is:
An attempt to map a file with a length of 0 (zero) fails with an error code of ERROR_FILE_INVALID. Applications should test for files with a length of 0 (zero) and reject those files. https://docs.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-createfilemappingw
Further complicating things when investigating this, there is also a bug in the error message above: all the parameters are being printed in the wrong fields. size is displaying the value of baseptr, and all the other parameters are off by one (disp_unit is displaying size==0, info is displaying disp_unit==1, ...). I did find the code issue causing this display bug; baseptr should be moved to second-to-last in the parameter list here:
Microsoft-MPI/src/mpi/msmpi/api/mpi_win.cpp
Line 494 in 7ff6bdc
Thank you!
--Dan Weiner, HPC Research Engineer, Convergent Science (convergecfd.com)
MPI-2.1 was published in 2008, but mpi.h still contains
#define MPI_VERSION 2
#define MPI_SUBVERSION 0
MS-MPI is portrayed at SC/MPI Forum meetings as merely lagging in MPI-3.1 implementation, but users who check the version macros get only MPI-2.0 (published in 2003). I realize that some functions from later standards are available, but these need to be tested individually if one wishes to use them while continuing to support MS-MPI.
It would be extremely valuable for the MPI community to understand whether conforming support for MPI-2.1 and later are planned, and if so, what the timeline may be.
Are there any plans on your road map to support CUDA-aware MPI?
Since one of the aims is "Ease of porting existing code that uses MPICH", I was wondering if there is a plan to be ABI compatible with MPICH similar to many other MPICH derivatives (see https://www.mpich.org/abi/)?
With OpenMPI and MPICH2 one can run either of the commands:
mpiexec --version
or
mpirun --version
to check the version. How do you do that with MS MPI?
Freeing a request for an immediate send causes subsequent sends to reuse the same buffer even if the previous operation hasn't completed, leading to erroneous data being sent.
I include a reproducer program that induces the bug.
#include <assert.h>
#include <mpi.h>
#define TRIGGER_BUG
int main(void)
{
int thread_lvl = MPI_THREAD_SINGLE;
MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &thread_lvl);
assert(thread_lvl >= MPI_THREAD_MULTIPLE);
static const int one = 1;
MPI_Request req_one;
MPI_Isend(&one, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req_one);
#ifdef TRIGGER_BUG
MPI_Request_free(&req_one);
#endif
static const int six = 6;
MPI_Request req_six;
MPI_Isend(&six, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req_six);
#ifndef TRIGGER_BUG
MPI_Request_free(&req_one);
#endif
MPI_Request_free(&req_six);
int recv_first = 0;
MPI_Recv(&recv_first, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
int recv_second = 0;
MPI_Recv(&recv_second, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
assert(recv_first + recv_second == 7);
MPI_Finalize();
}
The typeclass parameters are declared in mpi.h,
/* typeclasses */
#define MPI_TYPECLASS_REAL 1
#define MPI_TYPECLASS_INTEGER 2
#define MPI_TYPECLASS_COMPLEX 3
but are absent in the Fortran interface module mpi.f90, so code like this fails:
use mpi, only: MPI_TYPECLASS_INTEGER
mpi.h is missing the Out_writes declaration, which results in build errors with MSMPI_NO_SAL.
In OpenMPI we can select among different algorithms for the collective communication functions.
Take the broadcast function as an example; it has the following implementations: 0: ignore, 1: basic linear, 2: chain, 3: pipeline, 4: split binary tree, 5: binary tree, 6: binomial tree, 7: knomial tree, 8: scatter_allgather, 9: scatter_allgather_ring.
And we can select an implementation by setting '--mca coll_tuned_bcast_algorithm 7':
mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 7 -np 2 ./myprogram
Is there a way in Microsoft MPI to do the same or to change the tuning parameters?
ref: https://www.open-mpi.org/faq/?category=tuning#setting-mca-params
This issue was encountered in one of the tests for rsmpi. A C++ version of the test can be found here: https://gist.github.com/AndrewGaspar/c75a1f0db91aa4633492c563cf107c4b
This reproduces with the latest 10.1.1 release from the GitHub Releases page.
Essentially, it appears that progress isn't always made on barriers when interleaved with point-to-point communication. This only reproduces with 3 or more ranks. On my system, this is the result of running this program with 3 ranks:
> mpiexec.exe -n 3 .\repro.exe
1: Sending 1
2: Sending 1
1: Sent 1
2: Sent 1
2: Sending 2
2: Sent 2
1: Sending 2
1: Sent 2
0: Received 1 from 1
0: Received 2 from 1
0: Received 1 from 2
0: Received 2 from 2
1: Sending 3
1: Sent 3
0: Received 3 from 1
In this specific example, you can see the immediate barrier never completes for rank 2. However, the standard (MPI 3.1) seems to indicate that collectives, including Ibarrier, should be able to make progress at the same time as point-to-point communications.
Did this article get moved to a new url? It was a very useful tutorial example.
Cross-compiling with mingw used to work, but fails with a newer mingw
x86_64-w64-mingw32-gcc (SUSE Linux) 9.2.0
The failure messages report this:
/path/msmpi/include/mpi.h:514:1: error: expected '=', ',', ';', 'asm' or
'__attribute__' before 'MPI_Send'
MPI_Send(
^~~~~~~~
This traces back to many, many attribute expansions that appear to be specific to the Microsoft compiler. Older versions of GCC silently ignored these attributes, but they are now flagged as errors.
What is the best fix for this?
https://www.nuget.org/packages/msmpisdk/
MSMPIsdk 10.1.12498.18
10.1.12498.18 | 3 581 | 8 months ago
10.1.12498.16 | 235 | 10 months ago
https://github.com/microsoft/Microsoft-MPI/blob/master/src/mpi.props
<ProductMajorVersion>10</ProductMajorVersion>
<ProductMinorVersion>1</ProductMinorVersion>
<BuildMajorVersion>12498</BuildMajorVersion>
<BuildMinorVersion>16</BuildMinorVersion>
/cc @jithinjosepkl
Could you update the version of the files in the repository to the new version?
Opening the project results in
The function "GetType" on type "System.String" is not available for execution as an MSBuild property function.
Have you attempted to create a conda package for msmpi? I wanted to double check before trying to add a recipe myself.
Hi all.
I have a case where I'm using MPI inside a DLL file. It serves as an API, which means that MPI_Finalize() must be called only once, from inside the DLL, at the end of its caller program. To accomplish this I've put MPI_Finalize() in a function that is marked with __attribute__((destructor)), so its code will execute upon unloading of the DLL.
However, the program hangs there, and MPI_Finalize() never returns. My setup is (although the issue was reproducible on another PC as well):
Windows 10 Pro Version 21H1 (OS Build 19043.1237).
MS-MPI Version 10.1.12498.18.
Build with gcc 10.1.0 using mingw64 from MSYS2 with target x86_64-w64-mingw32.
I have managed to reproduce the issue in a minimal example:
#include <stdio.h>
#include <stdint.h>
#include <mpi.h>
void bin_destructor() __attribute__((destructor));
void bin_destructor(){
int initialized;
MPI_Initialized(&initialized);
int finalized;
MPI_Finalized(&finalized);
printf("*** %s. Line %d ***\n", __FILE__, __LINE__); fflush(stdout);
if(initialized && (!finalized)){
printf("*** %s. Line %d, Finalizing MPI...", __FILE__, __LINE__); fflush(stdout);
MPI_Finalize(); // <-- It hangs here
printf("Done. ***\n"); fflush(stdout);
}
printf("*** %s. Line %d ***\n", __FILE__, __LINE__); fflush(stdout);
}
extern "C" __stdcall __declspec(dllexport) int MyAPI(const int n){
int initialized;
MPI_Initialized(&initialized);
if( !initialized ) MPI_Init(NULL, NULL); // Initialize only once.
printf("*** %s. Line %d n=%d ***\n", __FILE__, __LINE__, n); fflush(stdout);
return 0;
}
compiling with
g++ -shared -o libmyapi.dll -Wall -fPIC -IC:\SourceCode\MPI\Include myapi.cpp C:\Windows\System32\msmpi.dll
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
extern "C" int MyAPI(const int n);
int main(int argc, char** argv){
printf("*** %s. Line %d ***\n", __FILE__, __LINE__); fflush(stdout);
MyAPI(1);
printf("*** %s. Line %d ***\n", __FILE__, __LINE__); fflush(stdout);
MyAPI(2);
printf("*** %s. Line %d ***\n", __FILE__, __LINE__); fflush(stdout);
return 0;
}
compiling with
g++ -o main.exe -Wall main.cpp -L. -llibmyapi
When I run the program serially, e.g. $>main.exe , the output is:
*** main.cpp. Line 8 ***
*** myapi.cpp. Line 26 n=1 ***
*** main.cpp. Line 10 ***
*** myapi.cpp. Line 26 n=2 ***
*** main.cpp. Line 12 ***
*** myapi.cpp. Line 13 ***
*** myapi.cpp. Line 15, Finalizing MPI...Done. ***
*** myapi.cpp. Line 19 ***
When I run the program in parallel, even with only 1 process, e.g. $>mpiexec -np 1 main.exe , the output is:
*** main.cpp. Line 8 ***
*** myapi.cpp. Line 26 n=1 ***
*** main.cpp. Line 10 ***
*** myapi.cpp. Line 26 n=2 ***
*** main.cpp. Line 12 ***
*** myapi.cpp. Line 13 ***
*** myapi.cpp. Line 15, Finalizing MPI...
..and it hangs there.
Any help so I can explain & resolve this behavior is greatly appreciated.
Best regards,
George
Currently, building MSMPI requires the WDK to be installed. As far as I can tell it is only required so that the MessageCompile command can be used in the project files. If that is true: would you accept an MR to replace MessageCompile with a custom build command that directly calls mc.exe with the required arguments? As far as I know mc.exe is part of the Windows SDK and not the WDK, so this should be possible.
Not having to install the driver development kit only to build MSMPI would be especially interesting for builds in automated environments (e.g. vcpkg).
edit: I have added a simple patch in an MR to build MSMPI from source in vcpkg, which replaces the MessageCompile command with a simple CustomBuild command and allows building MSMPI without the WDK being installed. See https://github.com/microsoft/vcpkg/blob/94aa80000a5e8770a1442990b2682d53156641a6/ports/msmpi/no-wdk.patch. Let me know if you would be interested in an MR proposing the same changes directly to MSMPI.
Maybe the information is buried somewhere, but I couldn't find it... Anyway, there should be a small notice on the main page stating which version of the MPI standard can be used with this implementation!
Requesting that the MPI 3.x Fortran API be made available: even if not the post-MPI-2.1 features, at least the mpi_f08 interface, as that avoids numerous compiler warnings.
use mpi_f08, only : mpi_init, mpi_send
There are libraries that support being built for Windows on ARM64 and require MSMPI, for instance parmetis.
Please provide the MSMPI runtime libraries built for ARM64.
Thank you,
Error MSB8020: The build tools for WindowsUserModeDriver10.0 (Platform Toolset = 'WindowsUserModeDriver10.0') cannot be found.
Installed WDK.vsix from C:\Program Files (x86)\Windows Kits\10\Vsix\VS2019.
BuildArchitecture i386: works.
BuildArchitecture amd64: fails.
gfortran - x64
edit Directory.Build.props
<PropertyGroup>
<GFORTRAN_BIN>"E:\tools\vcpkg\downloads\tools\msys2\fortran\bin"</GFORTRAN_BIN>
</PropertyGroup>
<PropertyGroup>
<OutputPath>$(StagingOutputRootPath)$(MSBuildProjectName)\</OutputPath>
<VCToolsVersion>14.25.28610</VCToolsVersion>
<WindowsTargetPlatformVersion>10.0.19582.0</WindowsTargetPlatformVersion>
<PlatformToolset>v142</PlatformToolset>
<OutDir>$(OutputPath)</OutDir>
<O>$(Platform)\$(Configuration)</O>
<O Condition="'$(Platform)' == 'x86'">$(Configuration)</O>
<ResolveNugetpackages>false</ResolveNugetpackages>
<IntDirSharingDetected>None</IntDirSharingDetected>
</PropertyGroup>
call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" -vcvars_ver=14.2 amd64
Microsoft-MPI>msbuild /p:Configuration=Release /t:Clean;ReBuild
msmpi.def for x64?
"E:\temp\Microsoft-MPI\dirs.proj" (целевой объект Clean;ReBuild ) (1) ->
"E:\temp\Microsoft-MPI\src\dirs.proj" (целевой объект по умолчанию) (2:2) ->
"E:\temp\Microsoft-MPI\src\mpi\dirs.proj" (целевой объект по умолчанию) (10:2) ->
"E:\temp\Microsoft-MPI\src\mpi\msmpi\dirs.proj" (целевой объект по умолчанию) (13:2) ->
"E:\temp\Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj" (целевой объект по умолчанию) (16:2) ->
(Целевой объект Link) ->
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Register_datarep єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [
E:\temp\Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Abort єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [E:\temp\Mic
rosoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Get_processor_name єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll)
[E:\temp\Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Get_address єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [E:\te
mp\Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Free_mem єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [E:\temp\
Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Alloc_mem єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [E:\temp
\Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Wtime єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [E:\temp\Mic
rosoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
msmpi.lib(msmpi.dll) : error LNK2005: MPI_Wtick єцх юяЁхфхыхэ т msmpi.lib(msmpi.dll) [E:\temp\Mic
rosoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
...
E:\temp\Microsoft-MPI\out\Release-x64\msmpi\msmpi.dll : fatal error LNK1169: юсэрЁєцхэ ьэюуюъЁрЄэ
ю юяЁхфхыхээ√щ ёшьтюы - юфшэ шыш сюыхх [E:\temp\Microsoft-MPI\src\mpi\msmpi\dll\msmpi.vcxproj]
Предупреждений: 2
Ошибок: 107