Comments (29)
Originally by Ayer, Timothy C. on 2008-08-04 09:32:36 -0500
This message has 1 attachment(s)
from mpich.
Originally by Ayer, Timothy C. on 2008-08-04 09:32:36 -0500
Attachment added: fpi.f
(2.4 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-08-04 09:58:49 -0500
Hi,
If you are running your executable from a shared network drive, you need
to map the network drive with mpiexec when launching your job (see the
"-map" option of mpiexec in the Windows developer's guide).
Also make sure that you have turned off the Windows firewall (or any other
firewall) on the machines involved in the job.
Try specifying the IP addresses of the machines instead of the
hostnames.
Let us know the results.
(PS: Instead of the "-hosts" option you could try the "-machinefile"
option available with mpiexec. See the Windows developer's guide for
details.)
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Monday, August 04, 2008 9:33 AM
To: undisclosed-recipients:
Subject: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
-----------------------------------------------------------+------------
Reporter: "Ayer, Timothy C." [email protected] | Type: bug
Status: new | Priority: major
Component: mpich2 |
-----------------------------------------------------------+------------
I am testing MPICH2 1.0.7 on Windows XP (SP2). I have installed it on 2
hosts (hostA, hostB) and am trying to run fpi.exe built with
fmpich2.lib.
The code hangs in an MPI_Bcast call. The fpi.exe source is attached.
The following tests work fine from hostA; both prompt for a number of
intervals, accept input, and produce an estimate of pi:
mpiexec.exe -hosts 2 hostA hostA \\hostA\temp\fpi.exe
mpiexec.exe -hosts 2 hostB hostB \\hostA\temp\fpi.exe
The following test hangs (in MPI_Bcast) when submitted from hostA. It
does prompt for input (number of intervals), but once a value is entered
it hangs. I have launched the smpd process using smpd -d but see no output
from smpd after I enter an interval value:
mpiexec.exe -hosts 2 hostA hostB \\hostA\temp\fpi.exe
Any suggestions would be appreciated. Also let me know if you want me to
send debug output.
Thanks,
Tim
Timothy C. Ayer
High Performance Technical Computing
United Technologies - Pratt & Whitney
[email protected]
(860) 565 - 5268 v
(860) 565 - 2668 f
<<fpi.f>>
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36
from mpich.
Originally by Jayesh Krishna on 2008-08-04 09:58:49 -0500
Attachment added: part0001.html
(4.1 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-08-04 13:01:41 -0500
Attachment added: part0001.2.html
(10.6 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-08-04 13:01:41 -0500
You should try,
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB y:\fpi.exe
Let us know if it works for you.
(PS: The shared drive is accessible across machines because the drive is
accessible/mapped by the user logged on to the machines. SMPD runs as a
service logged on as "Local System" and does not - should not- have access
to drives shared by users)
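To make the shape of that command explicit, here is a minimal Python sketch that assembles the argument list. It assumes the mapping is written as `drive:\\host\share` and uses only the `-map` and `-hosts` options quoted in this thread. The helper name `build_mpiexec_args` and the host, share, and drive names are placeholders for illustration, not part of MPICH2.

```python
def build_mpiexec_args(drive, unc_share, hosts, exe_name):
    """Assemble an mpiexec argument list that maps a UNC share to a drive letter.

    Hypothetical helper for illustration only; it uses just the -map and
    -hosts options that appear in this thread.
    """
    if not unc_share.startswith("\\\\"):
        raise ValueError("expected a UNC share that starts with two backslashes")
    mapping = f"{drive}:{unc_share}"
    return (
        ["mpiexec.exe", "-map", mapping, "-hosts", str(len(hosts))]
        + list(hosts)
        + [f"{drive}:\\{exe_name}"]
    )


if __name__ == "__main__":
    # Placeholder hosts and share, matching the names used in the report.
    args = build_mpiexec_args("y", r"\\hostA\temp", ["hostA", "hostB"], "fpi.exe")
    print(" ".join(args))
```

The executable is referenced through the mapped drive letter rather than the UNC path, which is the point of the suggestion above: the path has to resolve for the account the process manager runs under, not only for the logged-on user.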
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:50 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The exe can be accessed directly from hostB by executing
\\hostA\temp\fpi.exe; that is, you could type it directly into a command
prompt on hostB if you wanted. Note also that the \temp directory is a
shared location. I am not sure how this is physically set up on our
network, but this has worked without any "mapping" for MPICH (MPICH1).
Note: I did try: mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB
\\hostA\temp\fpi.exe but that still hangs in the MPI_Bcast call.
The interesting part is that it gets through the initialization:
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
All execute.
Thanks,
Tim
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:33 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
How (what mechanism) does hostB access data (exe) in hostA ?
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:31 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks Jayesh for the quick reply. This is a network-available UNC path -
why do I need to map a drive?
I am familiar with the machines file - I was just using the command line
for debugging.
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 13:26:47 -0500
Hello,
Was there actually a bug that has been fixed? ...so I should download
1.1a1, the pre-release version?
I had sent some smpd -d output to Jayesh Krishna on 8/5/2008 but did not
hear back.
Thanks for your help.
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:16 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+---------------
Reporter: "Ayer, Timothy C." [email protected] | Owner:
Type: bug | Status: closed
Priority: major | Component: mpich2
Resolution: fixed | Keywords:
------------------------------------------------------------+---------------
Changes (by thakur):
- status: new => closed
- resolution: => fixed
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36#comment:4
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 13:32:48 -0500
Sorry, I did read that message I was just a little surprised. Thank you.
Tim
from mpich.
Originally by Jayesh Krishna on 2008-08-13 13:58:26 -0500
Attachment added: part0001.3.html
(31.6 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-08-13 13:58:26 -0500
Hi,
The logs you sent show that the communication between the process
managers on the hosts is good. The problem looks to be with the
communication between the MPI processes.
Can you try compiling icpi.c (MPICH2\examples) and running the program in
your setup (to make sure that the problem is not related to the Fortran
bindings)?
I have seen that sometimes the uninstall/install of MPICH2 does
not result in the DLLs being updated correctly. (This has led to some
weird, difficult-to-debug hangs in our tests. It is not usual, but it does
not hurt to check for it.) To make sure that you have the right
DLLs, try listing the MPICH2 DLLs in your Windows system32 directory on
both hosts:
dir c:\windows\system32\mpich2*.dll
dir c:\windows\system32\mpe*.dll
Send us the results for verification (sanity check: they should have the
same datestamp).
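The datestamp check can also be scripted. The following is a minimal Python sketch, using only the standard library, that lists the last-modified date of each matching DLL and flags any whose date differs from the newest one. The function names `dll_dates` and `stale_dlls` are made up for this example, and comparing by calendar day is an assumption about what "same datestamp" means here.

```python
import glob
import os
from datetime import date


def dll_dates(directory, pattern="mpich2*.dll"):
    """Map each matching file name to its last-modified date."""
    result = {}
    for path in glob.glob(os.path.join(directory, pattern)):
        mtime = os.path.getmtime(path)
        result[os.path.basename(path)] = date.fromtimestamp(mtime)
    return result


def stale_dlls(dates):
    """Return the names whose date differs from the newest date in the set."""
    if not dates:
        return []
    newest = max(dates.values())
    return sorted(name for name, day in dates.items() if day != newest)


if __name__ == "__main__":
    # Directory taken from the dir commands above; run this on each host.
    found = dll_dates(r"c:\windows\system32")
    for name, day in sorted(found.items()):
        print(day.isoformat(), name)
    print("possibly stale:", stale_dlls(found))
```

A flagged file is only a hint that an earlier install left something behind; it does not by itself show that the file is the one being loaded.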
Also, when running fpi.exe in your setup, try leaving the job running for
10 minutes or so (or specify a timeout of about 10 minutes) and see if it
reports any errors. You might want to run netstat (or use "Process
Explorer" from Microsoft and check the TCP/IP tab under
process->properties) to see what happens to the connections between the MPI
processes on both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
is aimed at MPICH2 developers and not at production machines.)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d procs. Also, the output
from the mpiexec just so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170 10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connection between the MPI processes takes place during
MPI_Bcast() (not before that in fpi.f).
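Because the processes only connect to each other at the first communication call, a blocked connection between the hosts would show up as a hang in MPI_Bcast rather than as a failure in MPI_INIT. A generic TCP reachability check can help rule out basic network blocking. The sketch below uses only the Python standard library; the function name `can_connect`, the host name, and the port number are placeholders, and it tests general reachability to a listener you start yourself, not the ports MPICH2 actually picks.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host and port: start a listener on the remote host first,
    # then run this from the other host (and repeat in the other direction).
    print(can_connect("hostB", 5000))
```

Running the check in both directions matters, since a firewall may allow connections one way and drop them the other.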
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an FYI, we have been running MPICH on thousands of PCs for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Do you have Windows firewall (or any firewall) running on these machines?
Why do I see two inputs (10 & 100) in the mpiexec debug output ?
Can you send us the debug output of smpd along with mpiexec ?
Can you check the status of the remote smpd from each host ?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogeneous (both 32-bit). The output is attached.
Thanks for your help.
Tim
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Are you running the fpi.exe (fpi.f) provided with MPICH2, or have you
modified the program?
I am assuming that the setup is not heterogeneous (MPICH2 currently does
not support running jobs across machines with different data models; e.g.,
you cannot run your MPI job across 32-bit and 64-bit machines).
Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both machines in debug mode:
1. Stop any instances of smpd running on the system: smpd -stop
2. Start smpd in debug mode: smpd -d
Then run mpiexec in verbose mode:
mpiexec.exe -verbose -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP addresses or
actual hostnames in this email but did use them in testing):
mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line...what is that referring
to?
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:47 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
What is the error message (output) that you get when you run mpiexec?
Please provide us with the output of the following commands (make sure that
you specify the IP addresses of the hosts involved):
mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:25 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
No, this does not work...the behavior is the same. The UNCs should work, and
have worked, regardless of whether a user is logged in. We have never
relied on network drive mappings since they are intermittently an
"interactive" feature.
from mpich.
Originally by Rajeev Thakur on 2008-08-13 14:10:11 -0500
Tim,
We have a new bug tracking system (Trac) that I am not fully familiar
with. I was going through the list trying to close ones I thought
(mistakenly or otherwise) needed no further action. I didn't know that it
also sent a note to the sender :-). Jayesh will follow up with you further
on this issue.
Rajeev
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 14:11:09 -0500
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the dates
on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)??? ...I should
have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+---------------
Reporter: "Ayer, Timothy C." [email protected] | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+---------------
Comment (by Jayesh Krishna):
Hi,
The logs sent by you show that the communication btw the process
managers on the hosts is good. The problem looks to be with the
communication btw the MPI processes.
Can you try compiling icpi.c (MPICH2\examples) and run the program in
your setup (Make sure that the problem is not related to fortran
bindings).
I have seen that some times that the uninstall/install of MPICH2 does
not result in the dlls being updated correctly (This has lead to some
wierd-difficult-to-debug hangs in our tests. This is not usual but it does
not hurt to check for it though). To make sure that you have the right
dlls try listing the MPICH2 dlls in your windows system32 directory on
both the hosts,
dir c:\windows\system32\mpich2_.dll
dir c:\windows\system32\mpe_.dll
Send us the results for verification (Sanity check- they should have the
same datestamp)
Also when running fpi.exe using your setup try leaving the job (or may
be specify a timeout of 10 mins or so) for 10mins or so and see if it
reports any errors. You might want to run netstat (or use "Process
explorer" from microsoft and check the TCP/IP tab in the
process->properties) to see what happens to the connections btw the MPI
processes from both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=dow
nloads) is aimed at MPICH2 devs and not for production machines. )
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d procs. Also, the output
from the mpiexec just so you can see what I typed.
H:>mpiexec.exe -map v:\10.30.73.170\temp -hosts 2 10.30.73.170
10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connection between the MPI processes take place during
MPI_Bcast() (not before that in fpi.f).
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an fyi, we have been running MPICH on thousands of PC's for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Do you have windows firewall (or any firewall) running on these machines
?
Why do I see two inputs (10 & 100) in the mpiexec debug output ?
Can you send us the debug output of smpd along with mpiexec ?
Can you check the status of the remote smpd from each host ?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogenous (both 32-bit). The output is attached.
Thanks for your help.
Tim
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Are you running fpi.exe (fpi.f) provided with MPICH2 (Have you modified
the program ?)?
I am assuming that the setup is not heterogeneous (MPICH2 currently does
not support running jobs across machines with different data models eg:
You cannot run your MPI job across 32-bit and 64-bit machines)
Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both the machines in debug mode (1. Stop any instances of
smpd running on the system, smpd -stop 2. Start smpd in debug mode, smpd
-d) and run mpiexec in verbose mode (mpiexec.exe -verbose -map
y:\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB
y:\fpi.exe)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP address or
actual hostnames in this email but did use them in testing)
mpiexec.exe -map y:\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA
IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
mpiexec.exe -map y:\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line...what is that referring
to?
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:47 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
What is the error message (output) that you get when you run mpiexec ?
Pls provide us with the output of the following commands (Make sure that
you specify ipaddresses of the hosts involved),
mpiexec.exe -map y:\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA
IPAddressOf_hostB y:\fpi.exe
mpiexec.exe -map y:\IPAddressOf_hostA\temp hostname
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:25 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
No this does not work...the behavior is the same. The UNC's should/have
worked regardless of whether a user a user is logged in. We have never
relied on drive network drive mappings since they are intermittently an
"interactive" feature.
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:02 PM
To: 'Ayer, Timothy C.'
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You should try,
mpiexec.exe -map y:\hostA\temp -hosts 2 hostA hostB y:\fpi.exe
file://hosta/temp/fpi.exe
Let us know if it works for you.
(PS: The shared drive is accessible across machines because the drive is
accessible/mapped by the user logged on to the machines. SMPD runs as a
service logged on as "Local System" and does not - should not- have access
to drives shared by users)
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:50 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The exe can be directly accessed from hostB by executing
\\hostA\temp\fpi.exe; that is, you could type it directly into a command
prompt on hostB if you wanted. Note also that the \temp directory is a
shared location. I am not sure how this is physically set up on our
network, but this has worked without any "mapping" for MPICH (MPICH1).
Note: I did try: mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB
\\hostA\temp\fpi.exe but that still hangs in the MPI_Bcast call.
The interesting part is that it gets through the initialization:
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
All execute.
Thanks,
Tim
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:33 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
How (what mechanism) does hostB access data (exe) in hostA ?
Regards,
Jayesh
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:31 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks Jayesh for the quick reply. This is a network-available UNC path -
why do I need to map a drive?
I am familiar with the machines file - I was just using the command line
for debugging.
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 10:56 AM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
If you are running your executable from a shared network drive you need
to map (see "--map" option of mpiexec in the Windows developer's guide)
the network drive with mpiexec when launching your job.
Also make sure that you have turned the Windows firewall (or any other
firewalls) off on the machines involved in the job.
Try specifying the IP addresses of the machines instead of the
hostnames.
Let us know the results.
(PS: Instead of the "-hosts" option you could try using the "-machinefile"
option available with mpiexec. See the Windows developer's guide for
details.)
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Monday, August 04, 2008 9:33 AM
To: undisclosed-recipients:
Subject: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
-----------------------------------------------------------+----------------
Reporter: "Ayer, Timothy C." [email protected]  |  Type: bug
Status: new                                             |  Priority: major
Component: mpich2                                       |
-----------------------------------------------------------+----------------
I am testing MPICH2 1.0.7 on Windows XP (SP2). I have installed it on 2
hosts (hostA, hostB) and am trying to run fpi.exe built with fmpich2.lib.
The code is hanging in an MPI_Bcast call. The fpi.exe source is attached.
The following tests work fine from hostA; both prompt for a number of
intervals, accept input, and produce an estimate of PI:
mpiexec.exe -hosts 2 hostA hostA \\hostA\temp\fpi.exe
mpiexec.exe -hosts 2 hostB hostB \\hostA\temp\fpi.exe
The following test hangs when submitted from hostA (in MPI_Bcast). It
does prompt for input (number of intervals) but once entered it hangs. I
have launched the smpd process using smpd -d but see no output from
smpd after I enter an interval value:
mpiexec.exe -hosts 2 hostA hostB \\hostA\temp\fpi.exe
Any suggestions would be appreciated. Also let me know if you want me to
send debug output.
Thanks,
Tim
Timothy C. Ayer
High Performance Technical Computing
United Technologies - Pratt & Whitney
[email protected]
(860) 565 - 5268 v
(860) 565 - 2668 f
<<fpi.f>>
--
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36#comment:
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 14:13:57 -0500
Thanks for letting me know. I knew something was up...this explains it :)
...no worries.
Jayesh and I are currently "discussing" it. ;)
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:10 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+---------------
Reporter: "Ayer, Timothy C." [email protected]  |  Owner: jayesh
Type: bug                                                |  Status: assigned
Priority: major                                          |  Component: mpich2
Resolution:                                              |  Keywords:
------------------------------------------------------------+---------------
Comment (by Rajeev Thakur):
Tim,
We have a new bug tracking system (Trac) that I am not fully familiar
with. I was going through the list trying to close ones I thought
(mistakenly or otherwise) needed no further action. I didn't know that it
also sent a note to the sender :-). Jayesh will follow up with you further
on this issue.
Rajeev
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36#comment:
from mpich.
Originally by Jayesh Krishna on 2008-08-13 14:37:57 -0500
Attachment added: part0001.4.html
(26.6 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-08-13 14:37:57 -0500
Hi,
Hmmm... This looks like the problem that I mentioned in my email. The
*sshm*.dll files should have the same datestamp as the other DLLs (they
should not be from 2005!).
Please try the following:
1. Uninstall MPICH2 on the hosts involved in your job.
2. Manually delete the MPICH2 DLLs from the windows\system32 directory (please
be careful! Make sure that you delete only mpich2*.dll and mpe*.dll).
3. Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes.
4. Re-compile cpi.c/fpi.f and try running your job.
Let us know the results.
Regards,
Jayesh
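Before deleting anything by hand, it can help to list exactly which files the two patterns would match. The following is a minimal dry-run sketch, not part of MPICH2: the function name `select_mpich2_dlls` and the default `system32` path are assumptions for illustration, and the script only prints names, it never deletes.

```python
from fnmatch import fnmatchcase

# Patterns taken from the instructions above; matching is done on
# lower-cased names because Windows file names are case-insensitive.
PATTERNS = ("mpich2*.dll", "mpe*.dll")


def select_mpich2_dlls(names):
    """Return the names that match the MPICH2 DLL patterns, sorted."""
    selected = []
    for name in names:
        lowered = name.lower()
        if any(fnmatchcase(lowered, pattern) for pattern in PATTERNS):
            selected.append(name)
    return sorted(selected)


if __name__ == "__main__":
    import os

    # Assumed default location; adjust if Windows is installed elsewhere.
    system32 = r"c:\windows\system32"
    if os.path.isdir(system32):
        for name in select_mpich2_dlls(os.listdir(system32)):
            print(name)  # dry run: only prints, never deletes
```

Review the printed list before removing files, since a wildcard such as mpe*.dll could in principle match a DLL that does not belong to MPICH2.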
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+---------------
Reporter: "Ayer, Timothy C." [email protected]  |  Owner: jayesh
Type: bug                                                |  Status: assigned
Priority: major                                          |  Component: mpich2
Resolution:                                              |  Keywords:
------------------------------------------------------------+---------------
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)???
...I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
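A listing like the one above can also be checked mechanically. The sketch below is not part of MPICH2. It assumes the `dir` output format shown in this message (MM/DD/YYYY date, 12-hour time, size, file name) and reports the DLLs whose date differs from the most common one; the name `odd_dates` is made up for illustration.

```python
import re
from collections import Counter

# Matches rows such as "04/04/2008 05:46 PM 135,168 mpe.dll" from `dir`.
ROW = re.compile(
    r"^(\d{2}/\d{2}/\d{4})\s+\d{2}:\d{2}\s+[AP]M\s+[\d,]+\s+(\S+\.dll)",
    re.IGNORECASE,
)


def odd_dates(listing):
    """Map each DLL whose date differs from the most common date to that date."""
    rows = []
    for line in listing.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groups())  # (date, file name)
    if not rows:
        return {}
    common_date = Counter(date for date, _ in rows).most_common(1)[0][0]
    return {name: date for date, name in rows if date != common_date}
```

A differing date is only a hint that a file was not replaced by the installer; it may have a benign explanation, so treat the result as a prompt to ask rather than as proof of a broken install.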
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-------------
Reporter: "Ayer, Timothy C." [email protected]  |  Owner: jayesh
Type: bug                                                |  Status: assigned
Priority: major                                          |  Component: mpich2
Resolution:                                              |  Keywords:
------------------------------------------------------------+-------------
Comment (by Jayesh Krishna):
Hi,
The logs you sent show that the communication between the process
managers on the hosts is good. The problem looks to be with the
communication between the MPI processes.
Can you try compiling icpi.c (MPICH2\examples) and running the program in
your setup (to make sure that the problem is not related to the Fortran
bindings)?
I have seen that sometimes the uninstall/install of MPICH2 does
not result in the DLLs being updated correctly (this has led to some
weird, difficult-to-debug hangs in our tests; it is not usual, but it does
not hurt to check for it). To make sure that you have the right
DLLs, try listing the MPICH2 DLLs in the Windows system32 directory on
both hosts:
dir c:\windows\system32\mpich2*.dll
dir c:\windows\system32\mpe*.dll
Send us the results for verification (sanity check: they should have the
same datestamp).
Also, when running fpi.exe using your setup, try leaving the job running
for 10 minutes or so (or specify a timeout of about 10 minutes) and see if it
reports any errors. You might want to run netstat (or use "Process
Explorer" from Microsoft and check the TCP/IP tab in the
process->properties) to see what happens to the connections between the MPI
processes from both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
is aimed at MPICH2 devs and not for production machines.)
Regards,
Jayesh
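Alongside netstat, a direct TCP probe from one host to the other can show whether a connection can be opened at all. This is a minimal sketch using only the Python standard library; the function name `can_connect` is hypothetical, and the port has to be supplied by the caller because this thread does not say which ports the MPI processes use.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running the probe in both directions (hostA to hostB and hostB to hostA) against a port that a process is known to be listening on would help separate a one-way filtering problem from a general connectivity problem.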
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d procs. Also, the output
from mpiexec, just so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170 10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connections between the MPI processes are established
during MPI_Bcast() (not before that in fpi.f).
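The point above explains why MPI_INIT, MPI_COMM_RANK, and MPI_COMM_SIZE can all succeed while the first collective hangs. The toy model below is not MPICH2 code and makes no claim about its internals; `LazyChannel` is a hypothetical class that only illustrates the general pattern of opening a connection on first use rather than at start-up.

```python
import socket


class LazyChannel:
    """Toy channel that opens its TCP connection on first send, not at construction."""

    def __init__(self, host, port, timeout=5.0):
        self.address = (host, port)
        self.timeout = timeout
        self.sock = None  # nothing is connected yet

    @property
    def connected(self):
        return self.sock is not None

    def send(self, payload):
        if self.sock is None:
            # First communication: a blocked or misrouted path to the peer
            # would only show up here, not during construction.
            self.sock = socket.create_connection(self.address, timeout=self.timeout)
        self.sock.sendall(payload)

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None
```

In this model, constructing the channel always succeeds, so any problem between the two peers stays invisible until the first `send`, which is the analogue of the first MPI_Bcast in fpi.f.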
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an FYI, we have been running MPICH on thousands of PCs for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Do you have Windows firewall (or any firewall) running on these machines?
Why do I see two inputs (10 & 100) in the mpiexec debug output ?
Can you send us the debug output of smpd along with mpiexec ?
Can you check the status of the remote smpd from each host ?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogeneous (both 32-bit). The output is attached.
Thanks for your help.
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Are you running the fpi.exe (fpi.f) provided with MPICH2 (have you modified
the program)?
I am assuming that the setup is not heterogeneous (MPICH2 currently does
not support running jobs across machines with different data models, e.g.
you cannot run your MPI job across 32-bit and 64-bit machines).
Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both the machines in debug mode (1. Stop any instances of
smpd running on the system: smpd -stop. 2. Start smpd in debug mode: smpd
-d) and run mpiexec in verbose mode (mpiexec.exe -verbose -map
y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB
y:\fpi.exe).
Regards,
Jayesh
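When the same mpiexec invocation has to be repeated with different hosts, building the argument list in one place avoids typing mistakes. This is a small sketch that uses only the options quoted in this thread (-verbose, -map, -hosts); the helper name `mpiexec_args` is hypothetical, and writing the mapping as drive:\\host\share is an assumption about how the share is spelled.

```python
def mpiexec_args(mapping, hosts, program, verbose=False):
    """Build an mpiexec argument list from the options used in this thread."""
    args = ["mpiexec.exe"]
    if verbose:
        args.append("-verbose")
    # -hosts takes the host count followed by the host names.
    args += ["-map", mapping, "-hosts", str(len(hosts)), *hosts, program]
    return args


if __name__ == "__main__":
    # Placeholder names, as used in the emails above.
    command = mpiexec_args(
        r"y:\\IPAddressOf_hostA\temp",
        ["IPAddressOf_hostA", "IPAddressOf_hostB"],
        r"y:\fpi.exe",
        verbose=True,
    )
    print(" ".join(command))
```

The list form can be passed directly to `subprocess.run` on the machine that launches the job, which sidesteps shell quoting of the backslashes.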
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP addresses or
actual hostnames in this email but did use them in testing)
mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line... what is that
referring to?
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36
--
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36#comment:
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36#comment:
from mpich.
Originally by Jayesh Krishna on 2008-08-13 14:50:00 -0500
Hi,
I spoke too soon. We have discontinued support for the sshm channel, which
is why you have an old version of the sshm-related DLLs in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] | Owner:
jayesh
Type: bug | Status:
assigned
Priority: major | Component:
mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email.
-sshm*.dll s should have the same datestamp as other dlls (should not be
from 2005!).
Please try the following,
Uninstall MPICH2 on the hosts involved in your job.
Manually delete the MPICH2 dlls from windows\system32 directory (Please
be careful! Make sure that you delete only mpich2_.dll & mpe_.dll) #
Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes .
Re-compile cpi.c/fpi.c and try running your job.
Let us know the results.
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)???
...I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-------------
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-------------
Comment (by Jayesh Krishna):
Hi,
The logs sent by you show that the communication btw the process
managers on the hosts is good. The problem looks to be with the
communication btw the MPI processes.
Can you try compiling icpi.c (MPICH2\examples) and run the program in
your setup (Make sure that the problem is not related to fortran
bindings).
I have seen that some times that the uninstall/install of MPICH2 does
not result in the dlls being updated correctly (This has lead to some
wierd-difficult-to-debug hangs in our tests. This is not usual but it
does
not hurt to check for it though). To make sure that you have the right
dlls try listing the MPICH2 dlls in your windows system32 directory on
both the hosts,
dir c:\windows\system32\mpich2_.dll
dir c:\windows\system32\mpe_.dll
Send us the results for verification (Sanity check- they should have
the
same datestamp)
Also when running fpi.exe using your setup try leaving the job (or
may
be specify a timeout of 10 mins or so) for 10mins or so and see if it
reports any errors. You might want to run netstat (or use "Process
explorer" from microsoft and check the TCP/IP tab in the
process->properties) to see what happens to the connections btw the MPI
processes from both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=dow
nloads) is aimed at MPICH2 devs and not for production machines. )
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d procs. Also, the
output
from the mpiexec just so you can see what I typed.
H:>mpiexec.exe -map v:\10.30.73.170\temp -hosts 2 10.30.73.170
10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connection between the MPI processes take place
during
MPI_Bcast() (not before that in fpi.f).
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an fyi, we have been running MPICH on thousands of PC's for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was
the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Do you have windows firewall (or any firewall) running on these
machines
?
Why do I see two inputs (10 & 100) in the mpiexec debug output ?
Can you send us the debug output of smpd along with mpiexec ?
Can you check the status of the remote smpd from each host ?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogenous (both 32-bit). The output is attached.
Thanks for your help.
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Are you running fpi.exe (fpi.f) provided with MPICH2 (Have you
modified
the program ?)?
I am assuming that the setup is not heterogeneous (MPICH2 currently
does
not support running jobs across machines with different data models eg:
You cannot run your MPI job across 32-bit and 64-bit machines)
Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both the machines in debug mode (1. Stop any instances of
smpd running on the system, smpd -stop 2. Start smpd in debug mode,
smpd
-d) and run mpiexec in verbose mode (mpiexec.exe -verbose -map
y:\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB
y:\fpi.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP address or
actual hostnames in this email but did use them in testing)
mpiexec.exe -map y:\IPAddressOf_hostA\temp -hosts 2
IPAddressOf_hostA
IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
mpiexec.exe -map y:\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line...what is that
referring
to?
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:47 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
What is the error message (output) that you get when you run mpiexec ?
Pls provide us with the output of the following commands (Make sure
that
you specify ipaddresses of the hosts involved),
mpiexec.exe -map y:\IPAddressOf_hostA\temp -hosts 2
IPAddressOf_hostA
IPAddressOf_hostB y:\fpi.exe
mpiexec.exe -map y:\IPAddressOf_hostA\temp hostname
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:25 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
No this does not work...the behavior is the same. The UNC's
should/have
worked regardless of whether a user a user is logged in. We have never
relied on drive network drive mappings since they are intermittently an
"interactive" feature.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:02 PM
To: 'Ayer, Timothy C.'
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You should try,
mpiexec.exe -map y:\hostA\temp -hosts 2 hostA hostB y:\fpi.exe
file://hosta/temp/fpi.exe
Let us know if it works for you.
(PS: The shared drive is accessible across machines because the drive
is
accessible/mapped by the user logged on to the machines. SMPD runs as a
service logged on as "Local System" and does not - should not- have
access
to drives shared by users)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:50 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The exe can be accessed directly from hostB by executing
\\hostA\temp\fpi.exe; that is, you could type it directly into a command
prompt on hostB if you wanted. Note also that the \temp directory is a
shared location. I am not sure how this is physically set up on our
network, but this has worked without any "mapping" for MPICH (MPICH1).
Note: I did try
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB \\hostA\temp\fpi.exe
but that still hangs in the MPI_Bcast call.
The interesting part is that it gets through the initialization:
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
All execute.
Thanks,
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:33 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
How (what mechanism) does hostB access data (exe) in hostA ?
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:31 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, Jayesh, for the quick reply. This is a network-available UNC path;
why do I need to map a drive?
I am familiar with the machines file. I was just using the command line
for debugging.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 10:56 AM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
If you are running your executable from a shared network drive, you need
to map the network drive with mpiexec when launching your job (see the
"--map" option of mpiexec in the Windows developer's guide).
Also make sure that you have turned the Windows firewall (or any other
firewalls) off on the machines involved in the job.
Try specifying the IP addresses of the machines instead of the hostnames.
Let us know the results.
(PS: Instead of the "-hosts" option you could try using the "-machinefile"
option available with mpiexec. See the Windows developer's guide for
details.)
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:owner-
[email protected]]
On Behalf Of mpich2
Sent: Monday, August 04, 2008 9:33 AM
To: undisclosed-recipients:
Subject: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
-----------------------------------------------------------+------------
Reporter: "Ayer, Timothy C." [email protected] | Type: bug
Status: new | Priority: major
Component: mpich2 |
-----------------------------------------------------------+------------
I am testing MPICH2 1.0.7 on Windows XP (SP2). I have installed it on 2
hosts (hostA, hostB) and am trying to run fpi.exe built with fmpich2.lib.
The code hangs in an MPI_Bcast call. The fpi.exe source is attached.
The following tests work fine from hostA; both prompt for a number of
intervals, accept input, and produce an estimate of PI:
mpiexec.exe -hosts 2 hostA hostA \\hostA\temp\fpi.exe
mpiexec.exe -hosts 2 hostB hostB \\hostA\temp\fpi.exe
The following test hangs (in MPI_Bcast) when submitted from hostA. It
does prompt for input (number of intervals), but once a value is entered
it hangs. I have launched the smpd process using smpd -d but see no
output from smpd after I enter an interval value:
mpiexec.exe -hosts 2 hostA hostB \\hostA\temp\fpi.exe
Any suggestions would be appreciated. Also, let me know if you want me to
send debug output.
Thanks,
Tim
_____________________
Timothy C. Ayer
High Performance Technical Computing
United Technologies - Pratt & Whitney
[email protected]
(860) 565 - 5268 v
(860) 565 - 2668 f
<<fpi.f>>
--
Ticket URL: https://trac.mcs.anl.gov/projects/mpich2/ticket/36
from mpich.
Originally by Jayesh Krishna on 2008-08-13 14:50:00 -0500
Attachment added: part0001.5.html
(30.8 KiB)
Added by email2trac
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 14:53:03 -0500
That's a bummer. I thought for sure that must be it... oh well. I will
pursue the other two options.
Thanks,
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:50 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+---------------
Reporter: "Ayer, Timothy C." [email protected] | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+---------------
Comment (by Jayesh Krishna):
Hi,
I spoke too soon. We have discontinued support for the sshm channel, and
that is why you have old versions of the sshm-related DLLs in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email.
The *sshm*.dll files should have the same datestamp as the other DLLs
(they should not be from 2005!).
Please try the following:
# Uninstall MPICH2 on the hosts involved in your job.
# Manually delete the MPICH2 DLLs from the windows\system32 directory
(please be careful! Make sure that you delete only mpich2*.dll & mpe*.dll).
# Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes.
# Re-compile cpi.c/fpi.f and try running your job.
Let us know the results.
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)???
...I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Jayesh Krishna):
Hi,
The logs you sent show that the communication between the process
managers on the hosts is good. The problem appears to be with the
communication between the MPI processes.
# Can you try compiling icpi.c (MPICH2\examples) and running the program
in your setup? (This is to make sure that the problem is not related to
the Fortran bindings.)
# I have seen that sometimes the uninstall/install of MPICH2 does not
result in the DLLs being updated correctly. (This has led to some weird,
difficult-to-debug hangs in our tests. This is not usual, but it does not
hurt to check for it.) To make sure that you have the right DLLs, try
listing the MPICH2 DLLs in your windows system32 directory on both the
hosts:
>>> dir c:\windows\system32\mpich2*.dll
>>> dir c:\windows\system32\mpe*.dll
Send us the results for verification. (Sanity check: they should have the
same datestamp.)
# Also, when running fpi.exe using your setup, try leaving the job running
for 10 minutes or so (or specify a timeout of about 10 minutes) and see if
it reports any errors. You might want to run netstat (or use "Process
Explorer" from Microsoft and check the TCP/IP tab in the
process->properties) to see what happens to the connections between the
MPI processes from both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
is aimed at MPICH2 developers and not at production machines.)
Regards,
Jayesh
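The datestamp sanity check described above can also be scripted instead of comparing dir listings by eye. The following is a minimal sketch, assuming Python is available on the hosts; the function names group_dlls_by_date and odd_ones_out are hypothetical, and the patterns are the ones from the dir commands above. A differing date is only a hint to look closer, not proof of a broken install.

```python
import datetime
import fnmatch
import os
from collections import defaultdict


def group_dlls_by_date(directory, patterns=("mpich2*.dll", "mpe*.dll")):
    """Return {modification date: [file names]} for files matching `patterns`."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(directory)):
        # Compare in lower case so the match does not depend on the platform.
        if any(fnmatch.fnmatch(name.lower(), p) for p in patterns):
            mtime = os.path.getmtime(os.path.join(directory, name))
            groups[datetime.date.fromtimestamp(mtime)].append(name)
    return dict(groups)


def odd_ones_out(groups):
    """Return the file names whose date differs from the most common date."""
    if not groups:
        return []
    common = max(groups, key=lambda d: len(groups[d]))
    return sorted(n for d, names in groups.items() if d != common for n in names)


if __name__ == "__main__":
    # Assumption: the DLLs live in %SystemRoot%\system32, as in this thread.
    system32 = os.path.join(os.environ.get("SystemRoot", r"C:\Windows"), "system32")
    if os.path.isdir(system32):
        found = group_dlls_by_date(system32)
        for date, names in sorted(found.items()):
            print(date, ", ".join(names))
        print("Different date from the rest:", odd_ones_out(found))
```

Running it on each host and comparing the printed dates gives the same information as the two dir commands.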
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d processes. Also included
is the output from mpiexec, just so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170 10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connection between the MPI processes is established
during MPI_Bcast() (not before that in fpi.f).
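Since the process-to-process connection is only made at the first communication call, a plain TCP reachability test between the two hosts can help separate a network problem from an MPI problem. The following is a minimal sketch, assuming Python is available; can_connect is a hypothetical helper, and the port to test is an assumption that has to match a listener you start yourself on the other host (it says nothing about which ports MPICH2 actually uses).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, start any TCP listener on hostB on a free port, then call can_connect with hostB's IP address and that port from hostA, and repeat in the other direction.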
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an FYI, we have been running MPICH on thousands of PCs for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Do you have Windows Firewall (or any firewall) running on these machines?
# Why do I see two inputs (10 & 100) in the mpiexec debug output?
# Can you send us the debug output of smpd along with mpiexec?
# Can you check the status of the remote smpd from each host?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe from a shared drive across two 32-bit
Windows XP machines in our lab but did not get any errors/hang.)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogeneous (both 32-bit). The output is attached.
Thanks for your help.
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Are you running the fpi.exe (fpi.f) provided with MPICH2? (Have you
modified the program?)
# I am assuming that the setup is not heterogeneous. (MPICH2 currently
does not support running jobs across machines with different data models;
e.g., you cannot run your MPI job across 32-bit and 64-bit machines.)
# Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both the machines in debug mode (1. Stop any instances of
smpd running on the system: smpd -stop. 2. Start smpd in debug mode:
smpd -d) and run mpiexec in verbose mode (mpiexec.exe -verbose -map
y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB
y:\fpi.exe).
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP address or
actual hostnames in this email but did use them in testing)
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line...what is that
referring
to?
from mpich.
Originally by Jayesh Krishna on 2008-08-13 15:01:56 -0500
Attachment added: part0001.6.html
(39.5 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-08-13 15:01:56 -0500
Hi,
I just cross-verified the timestamps of the dlls and they look alright.
Make sure that you have the date/timestamps right on all the hosts
involved.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:53 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] | Owner:
jayesh
Type: bug | Status:
assigned
Priority: major | Component:
mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
That's a bummer I thought for sure that must be it....oh well. I will
pursue the other two options.
Thanks,
Tim
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:50 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-------------
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-------------
Comment (by Jayesh Krishna):
Hi,
I spoke too soon. We have discontinued supporting sshm channel and
that
is the reason that you have an old version of sshm related dlls in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:owner-
[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email.
-sshm*.dll s should have the same datestamp as other dlls (should not
be
from 2005!).
Please try the following,
Uninstall MPICH2 on the hosts involved in your job.
Manually delete the MPICH2 dlls from windows\system32 directory
(Please
be careful! Make sure that you delete only mpich2_.dll & mpe_.dll) #
Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes .
Re-compile cpi.c/fpi.c and try running your job.
Let us know the results.
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)???
...I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll
<<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll
<<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
-----Original Message-----
From: [email protected]
[mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-------------
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-------------
Comment (by Jayesh Krishna):
Hi,
The logs sent by you show that the communication btw the process
managers on the hosts is good. The problem looks to be with the
communication btw the MPI processes.
# Can you try compiling icpi.c (MPICH2\examples) and run the program
in
your setup (Make sure that the problem is not related to fortran
bindings).
# I have seen that some times that the uninstall/install of MPICH2
does
not result in the dlls being updated correctly (This has lead to some
wierd-difficult-to-debug hangs in our tests. This is not usual but it
does
not hurt to check for it though). To make sure that you have the
right
dlls try listing the MPICH2 dlls in your windows system32 directory
on
both the hosts,
>>> dir c:\windows\system32\mpich2*.dll
>>> dir c:\windows\system32\mpe*.dll
Send us the results for verification (Sanity check- they should
have
the
same datestamp)
# Also when running fpi.exe using your setup try leaving the job (or
may
be specify a timeout of 10 mins or so) for 10mins or so and see if it
reports any errors. You might want to run netstat (or use "Process
explorer" from microsoft and check the TCP/IP tab in the
process->properties) to see what happens to the connections btw the
MPI
processes from both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=dow
nloads) is aimed at MPICH2 devs and not for production machines. )
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d procs. Also, the
output
from the mpiexec just so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170
10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connection between the MPI processes take place
during
MPI_Bcast() (not before that in fpi.f).
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted
to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an fyi, we have been running MPICH on thousands of PC's for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was
the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Do you have windows firewall (or any firewall) running on these
machines
?
# Why do I see two inputs (10 & 100) in the mpiexec debug output ?
# Can you send us the debug output of smpd along with mpiexec ?
# Can you check the status of the remote smpd from each host ?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogenous (both 32-bit). The output is attached.
Thanks for your help.
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Are you running fpi.exe (fpi.f) provided with MPICH2 (Have you
modified
the program ?)?
# I am assuming that the setup is not heterogeneous (MPICH2 currently
does
not support running jobs across machines with different data models
eg:
You cannot run your MPI job across 32-bit and 64-bit machines)
# Please provide us with the debug/verbose output when running
fpi.exe.
Start smpd on both the machines in debug mode (1. Stop any instances
of
smpd running on the system, smpd -stop 2. Start smpd in debug mode,
smpd
-d) and run mpiexec in verbose mode (mpiexec.exe -verbose -map
y:\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA
IPAddressOf_hostB
y:\fpi.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP address or
actual hostnames in this email but did use them in testing)
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2
IPAddressOf_hostA
IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line...what is that
referring to?
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:47 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
What is the error message (output) that you get when you run mpiexec?
Please provide us with the output of the following commands (make sure
that you specify the IP addresses of the hosts involved):
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:25 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
No, this does not work...the behavior is the same. The UNC paths should
work (and have worked) regardless of whether a user is logged in. We have
never relied on network drive mappings since they are intermittently an
"interactive" feature.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:02 PM
To: 'Ayer, Timothy C.'
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You should try:
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB y:\fpi.exe
Let us know if it works for you.
(PS: The shared drive is accessible across machines because it is
accessible/mapped by the user logged on to the machines. SMPD runs as a
service logged on as "Local System" and does not, and should not, have
access to drives shared by users.)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:50 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The exe can be directly accessed from hostB by executing
\\hostA\temp\fpi.exe; that is, you could type it directly into a command
prompt on hostB if you wanted. Note also that the \temp directory is a
shared location. I am not sure how this is physically set up on our
network, but it has worked without any "mapping" for MPICH (MPICH1).
Note: I did try
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB \\hostA\temp\fpi.exe
but that still hangs in the MPI_Bcast call.
The interesting part is that it gets through the initialization:
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
All execute.
Thanks,
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:33 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
How (what mechanism) does hostB access data (exe) in hostA ?
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:31 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks Jayesh for the quick reply. This is a network-available UNC path;
why do I need to map a drive?
I am familiar with the machines file; I was just using the command line
for debugging.
--
Ticket URL: <https://trac.mcs.anl.gov/projects/mpich2/ticket/36>
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 15:08:53 -0500
Will do.
From: Jayesh Krishna [mailto:[email protected]]
Sent: Wednesday, August 13, 2008 4:02 PM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
I just cross-verified the timestamps of the dlls and they look alright.
Make sure that you have the date/timestamps right on all the hosts involved.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:53 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------+-------------------
Reporter: "Ayer, Timothy C." [email protected]  | Owner: jayesh
Type: bug                                       | Status: assigned
Priority: major                                 | Component: mpich2
Resolution:                                     | Keywords:
------------------------------------------------+-------------------
Comment (by Ayer, Timothy C.):
That's a bummer. I thought for sure that must be it...oh well. I will
pursue the other two options.
Thanks,
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:50 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Jayesh Krishna):
Hi,
I spoke too soon. We have discontinued support for the sshm channel, which
is why you have an old version of the sshm-related DLLs in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:owner-[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email. The
mpich2sshm*.dll files should have the same datestamp as the other DLLs
(they should not be from 2005!).
Please try the following:
# Uninstall MPICH2 on the hosts involved in your job.
# Manually delete the MPICH2 DLLs from the windows\system32 directory
(please be careful! Make sure that you delete only mpich2*.dll & mpe*.dll).
# Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes.
# Re-compile cpi.c/fpi.f and try running your job.
Let us know the results.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)???
...I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
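The datestamp comparison discussed in this thread can also be scripted against saved `dir` output. The following is a minimal sketch, assuming the listing format shown above (MM/DD/YYYY date, 12-hour time, size, file name); the function name `odd_datestamps` is hypothetical. It flags DLLs whose date differs from the most common date in the listing. A flagged file is only a hint to look closer: elsewhere in this thread the 2005 sshm DLLs are explained by the sshm channel no longer being supported.

```python
import re
from collections import Counter

# One file row of `dir` output, e.g.
# "04/04/2008 05:28 PM 1,110,016 mpich2.dll"
ROW = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{4})\s+\d{2}:\d{2}\s+[AP]M\s+[\d,]+\s+(\S+\.dll)",
    re.IGNORECASE,
)


def odd_datestamps(dir_output):
    """Return {dll_name: date} for DLLs not dated like the majority."""
    rows = [m.groups() for m in map(ROW.match, dir_output.splitlines()) if m]
    if not rows:
        return {}
    common_date, _ = Counter(date for date, _ in rows).most_common(1)[0]
    return {name: date for date, name in rows if date != common_date}


listing = """\
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
"""
print(odd_datestamps(listing))
```

Running the same check on the listing from each host makes it easier to spot a host where the install did not refresh the DLLs.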
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Comment (by Jayesh Krishna):
Hi,
The logs you sent show that the communication between the process
managers on the hosts is good. The problem appears to be with the
communication between the MPI processes.
# Can you try compiling icpi.c (MPICH2\examples) and running the program
in your setup (to make sure that the problem is not related to the Fortran
bindings)?
# I have seen that sometimes the uninstall/install of MPICH2 does not
result in the DLLs being updated correctly (this has led to some weird,
difficult-to-debug hangs in our tests; it is not usual, but it does not
hurt to check for it). To make sure that you have the right DLLs, try
listing the MPICH2 DLLs in your windows system32 directory on both hosts:
>>> dir c:\windows\system32\mpich2*.dll
>>> dir c:\windows\system32\mpe*.dll
Send us the results for verification (sanity check: they should have the
same datestamp).
# Also, when running fpi.exe in your setup, try leaving the job running for
10 minutes or so (or specify a timeout of about 10 minutes) and see if it
reports any errors. You might want to run netstat (or use "Process
Explorer" from Microsoft and check the TCP/IP tab under
process->properties) to see what happens to the connections between the
MPI processes on both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
is aimed at MPICH2 devs and not at production machines.)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d procs. Also, the output
from the mpiexec just so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170 10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connection between the MPI processes takes place
during MPI_Bcast() (not before that in fpi.f).
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi; I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an FYI, we have been running MPICH on thousands of PCs for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Do you have Windows Firewall (or any other firewall) running on these
machines?
# Why do I see two inputs (10 & 100) in the mpiexec debug output ?
# Can you send us the debug output of smpd along with mpiexec ?
# Can you check the status of the remote smpd from each host ?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
from mpich.
Originally by Ayer, Timothy C. on 2008-08-13 15:08:54 -0500
Attachment added: part0001.7.html
(39.7 KiB)
Added by email2trac
from mpich.
Originally by Ayer, Timothy C. on 2008-09-11 12:08:42 -0500
Jayesh,
I apologize for the delay. I hope to get back to this soon but other items
have taken higher priority.
Thanks,
Tim
From: Ayer, Timothy C.
Sent: Wednesday, August 13, 2008 4:10 PM
To: Jayesh Krishna; Ayer, Timothy C.
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Will do.
From: Jayesh Krishna [mailto:[email protected]]
Sent: Wednesday, August 13, 2008 4:02 PM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
I just cross-verified the timestamps of the dlls and they look alright.
Make sure that you have the date/timestamps right on all the hosts involved.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]
mailto:[email protected] ] On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:53 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] | Owner:
jayesh
Type: bug | Status:
assigned
Priority: major | Component:
mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
That's a bummer I thought for sure that must be it....oh well. I will
pursue the other two options.
Thanks,
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]
mailto:[email protected] ]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:50 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+---------------
Reporter: "Ayer, Timothy C." [email protected] | Owner:
jayesh
Type: bug | Status:
assigned
Priority: major | Component:
mpich2
Resolution: | Keywords:
------------------------------------------------------------+---------------
Comment (by Jayesh Krishna):
Hi,
I spoke too soon. We have discontinued supporting sshm channel and that
is the reason that you have an old version of sshm related dlls in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:owner- mailto:owner-
[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email.
-sshm*.dll s should have the same datestamp as other dlls (should not be
from 2005!).
Please try the following,
Uninstall MPICH2 on the hosts involved in your job.
Manually delete the MPICH2 dlls from windows\system32 directory
(Please
be careful! Make sure that you delete only mpich2_.dll & mpe_.dll) #
Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes .
Re-compile cpi.c/fpi.c and try running your job.
Let us know the results.
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]
mailto:[email protected] ]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)???
...I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
-----Original Message-----
From: [email protected]
[mailto:[email protected]
mailto:[email protected] ]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-------------
Reporter: "Ayer, Timothy C." [email protected] |
Owner:
jayesh
Type: bug |
Status:
assigned
Priority: major |
Component:
mpich2
Resolution: |
Keywords:
------------------------------------------------------------+-------------
Comment (by Jayesh Krishna):
Hi,
The logs you sent show that communication between the process
managers on the hosts is good. The problem appears to be with the
communication between the MPI processes.
# Can you try compiling icpi.c (MPICH2\examples) and running it in your
setup? (This makes sure the problem is not related to the Fortran
bindings.)
# I have seen that uninstalling/reinstalling MPICH2 sometimes does not
update the dlls correctly. (This has led to some weird,
difficult-to-debug hangs in our tests. It is not usual, but it does not
hurt to check.) To make sure that you have the right dlls, try listing
the MPICH2 dlls in the windows system32 directory on both hosts,
>>> dir c:\windows\system32\mpich2*.dll
>>> dir c:\windows\system32\mpe*.dll
Send us the results for verification (sanity check: they should have the
same datestamp).
# Also, when running fpi.exe in your setup, try leaving the job running
for 10 minutes or so (or specify a timeout of about 10 minutes) and see
if it reports any errors. You might want to run netstat (or use "Process
Explorer" from Microsoft and check the TCP/IP tab under
process->properties) to see what happens to the connections between the
MPI processes on both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
is aimed at MPICH2 devs and not at production machines.)
Regards,
Jayesh
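The datestamp sanity check described above can be automated. The following is a minimal Python sketch and not part of MPICH2: it assumes `dir` output in the US date format shown in this thread, and the helper name `odd_datestamps` is hypothetical. It reports DLLs whose date differs from the most common one. A mismatch is only a prompt for a closer look, not proof of a broken install.

```python
import re
from collections import Counter

# One file row of `dir` output, e.g.
# "04/04/2008 05:28 PM 1,110,016 mpich2.dll"
ROW = re.compile(
    r"^(\d{2}/\d{2}/\d{4})\s+\d{2}:\d{2}\s+[AP]M\s+[\d,]+\s+(\S+\.dll)",
    re.IGNORECASE,
)

def odd_datestamps(dir_output):
    """Return {dll_name: date} for DLLs whose date differs from the most common date."""
    dates = {}
    for line in dir_output.splitlines():
        match = ROW.match(line.strip())
        if match:
            date, name = match.groups()
            dates[name] = date
    if not dates:
        return {}
    common_date, _ = Counter(dates.values()).most_common(1)[0]
    return {name: date for name, date in dates.items() if date != common_date}

# Rows copied from the listing posted in this ticket.
sample = """\
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
"""
print(odd_datestamps(sample))
```

Running the same function over the listing from each host and comparing the results gives a quick cross-host check.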
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Please find attached the output from the smpd -d processes, along with
the mpiexec output so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170 10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The socket/channel connections between the MPI processes are established
during MPI_Bcast() (not before that in fpi.f).
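Since the process-to-process connections are only opened at the first communication call, the initialization calls in fpi.f can succeed even when the two hosts cannot reach each other directly. One way to rule out basic TCP reachability is a generic probe like the sketch below. It is plain Python and uses no MPICH2 API; the function name `can_connect` is hypothetical, and the ports an MPI job actually uses are not stated in this thread, so the demo only exercises the loopback interface.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Self-contained demo: listen on a free loopback port, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(can_connect("127.0.0.1", demo_port))
listener.close()
```

To test two real machines, one would run a listener on one host and call `can_connect` with that host's IP address from the other, in both directions.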
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi...I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an fyi, we have been running MPICH on thousands of PCs for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Do you have Windows Firewall (or any other firewall) running on these
machines?
# Why do I see two inputs (10 & 100) in the mpiexec debug output?
# Can you send us the debug output of smpd along with that of mpiexec?
# Can you check the status of the remote smpd from each host?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe from a shared drive across two 32-bit
Windows XP machines in our lab and did not get any errors or hangs.)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
This is the same fpi.f that comes with the installation, except that I
have added print statements.
The setup is homogeneous (both 32-bit). The output is attached.
Thanks for your help.
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
# Are you running the fpi.exe (fpi.f) provided with MPICH2, or have you
modified the program?
# I am assuming that the setup is not heterogeneous. (MPICH2 currently
does not support running jobs across machines with different data
models, e.g. you cannot run an MPI job across 32-bit and 64-bit
machines.)
# Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both machines in debug mode:
1. Stop any instances of smpd running on the system: smpd -stop
2. Start smpd in debug mode: smpd -d
Then run mpiexec in verbose mode:
mpiexec.exe -verbose -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks, here is the output (note: I have not included IP addresses or
actual hostnames in this email but did use them in testing)
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You have "hostname" at the end of the second line...what is that
referring to?
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:47 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
What is the error message (output) that you get when you run mpiexec?
Please provide us with the output of the following commands (make sure
that you specify the IP addresses of the hosts involved),
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:25 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
No, this does not work...the behavior is the same. The UNC paths should
work (and have worked) regardless of whether a user is logged in. We
have never relied on network drive mappings since they are
intermittently an "interactive" feature.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 2:02 PM
To: 'Ayer, Timothy C.'
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
You should try,
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB y:\fpi.exe
Let us know if it works for you.
(PS: The shared drive is accessible across machines because it is
accessible/mapped by the user logged on to those machines. SMPD runs as
a service logged on as "Local System" and does not, and should not, have
access to drives shared by users.)
Regards,
Jayesh
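The -map form suggested above follows a fixed pattern: map a drive letter to the UNC share, then refer to the executable through that drive. As an illustration only, the Python sketch below assembles that argument list. `build_mpiexec_command` is a hypothetical helper, and it uses only the options that appear in this thread (-map and -hosts).

```python
def build_mpiexec_command(drive, unc_share, hosts, exe_name):
    """Assemble an mpiexec argument list that maps `drive` to `unc_share`.

    Only options quoted in this ticket are used; nothing is executed here.
    """
    mapping = f"{drive}:{unc_share}"      # drive letter followed by the UNC share
    exe_path = f"{drive}:\\{exe_name}"    # executable addressed via the mapped drive
    return ["mpiexec.exe", "-map", mapping,
            "-hosts", str(len(hosts)), *hosts, exe_path]

cmd = build_mpiexec_command("y", r"\\hostA\temp", ["hostA", "hostB"], "fpi.exe")
print(" ".join(cmd))
```

Keeping the arguments as a list avoids quoting problems if the command is later passed to a process launcher.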
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:50 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
The exe can be accessed directly from hostB by executing
\\hostA\temp\fpi.exe; that is, you could type it directly into a command
prompt on hostB if you wanted. Note also that the \temp directory is a
shared location. I am not sure how this is physically set up on our
network, but it has worked without any "mapping" for MPICH (MPICH1).
Note: I did try
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB \\hostA\temp\fpi.exe
but that still hangs in the MPI_Bcast call.
The interesting part is that it gets through the initialization:
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
All execute.
Thanks,
Tim
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 1:33 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
How (what mechanism) does hostB access data (exe) in hostA ?
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:31 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Thanks Jayesh for the quick reply. This is a network-available UNC path;
why do I need to map a drive?
I am familiar with the machines file. I was just using the command line
for debugging.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Monday, August 04, 2008 10:56 AM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
If you are running your executable from a shared network drive, you need
to map the network drive with mpiexec when launching your job (see the
"-map" option of mpiexec in the Windows developer's guide).
Also make sure that you have turned off the Windows firewall (and any
other firewalls) on the machines involved in the job.
Try specifying the IP addresses of the machines instead of the
hostnames.
Let us know the results.
(PS: Instead of the "-hosts" option you could try the "-machinefile"
option of mpiexec. See the Windows developer's guide for details.)
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Monday, August 04, 2008 9:33 AM
To: undisclosed-recipients:
Subject: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
-----------------------------------------------------------+------------
Reporter: "Ayer, Timothy C." [email protected] | Type: bug
Status: new | Priority: major
Component: mpich2 |
-----------------------------------------------------------+------------
I am testing MPICH2 (MPICH2-1.0.7) on Windows XP (SP2). I have installed
it on 2 hosts (hostA, hostB) and am trying to run the fpi.exe built with
fmpich2.lib. The code is hanging in an MPI_Bcast call. The fpi.exe
source is attached.
The following tests work fine from hostA; both prompt for a number of
intervals, accept input, and produce an estimate of PI:
mpiexec.exe -hosts 2 hostA hostA \\hostA\temp\fpi.exe
mpiexec.exe -hosts 2 hostB hostB \\hostA\temp\fpi.exe
The following test hangs (in MPI_Bcast) when submitted from hostA. It
does prompt for input (number of intervals), but once a value is entered
it hangs. I have launched the smpd process using smpd -d but see no
output from smpd after I enter an interval value:
mpiexec.exe -hosts 2 hostA hostB \\hostA\temp\fpi.exe
Any suggestions would be appreciated. Also let me know if you want me to
send debug output.
Thanks,
Tim
_____________________
Timothy C. Ayer
High Performance Technical Computing
United Technologies - Pratt & Whitney
[email protected]
(860) 565 - 5268 v
(860) 565 - 2668 f
<<fpi.f>>
--
Ticket URL: <https://trac.mcs.anl.gov/projects/mpich2/ticket/36>
from mpich.
Originally by Ayer, Timothy C. on 2008-09-11 12:08:42 -0500
Attachment added: part0001.8.html
(40.9 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-10-23 16:24:38 -0500
Attachment added: part0001.9.html
(47.2 KiB)
Added by email2trac
from mpich.
Originally by Jayesh Krishna on 2008-10-23 16:24:38 -0500
Hi,
Did you get a chance to look at the setup ?
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Thursday, September 11, 2008 12:09 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
Comment (by Ayer, Timothy C.):
Jayesh,
I apologize for the delay. I hope to get back to this soon but other
items have taken higher priority.
Thanks,
Tim
_____
From: Ayer, Timothy C.
Sent: Wednesday, August 13, 2008 4:10 PM
To: Jayesh Krishna; Ayer, Timothy C.
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Will do.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Wednesday, August 13, 2008 4:02 PM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
I just cross-verified the timestamps of the dlls and they look alright.
Make sure that you have the date/timestamps right on all the hosts
involved.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:53 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
Comment (by Ayer, Timothy C.):
That's a bummer I thought for sure that must be it....oh well. I will
pursue the other two options.
Thanks,
Tim
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:50 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-------------
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-------------
Comment (by Jayesh Krishna):
Hi,
I spoke too soon. We have discontinued support for the sshm channel, and
that is why you have an old version of the sshm-related dlls in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email. The
*sshm*.dll files should have the same datestamp as the other dlls (they
should not be from 2005!).
Please try the following,
# Uninstall MPICH2 on the hosts involved in your job.
# Manually delete the MPICH2 dlls from the windows\system32 directory.
(Please be careful! Make sure that you delete only mpich2*.dll &
mpe*.dll.)
# Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes.
# Re-compile cpi.c/fpi.f and try running your job.
Let us know the results.
Regards,
Jayesh
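Before deleting anything by hand as described above, it may help to list what the two wildcard patterns would actually match. The Python sketch below is a dry run under stated assumptions: `matching_dlls` is a hypothetical helper, it only lists files and never deletes them, and the demo runs against a temporary directory rather than a real system32. A wildcard such as mpe*.dll can also match unrelated files, so the result should be reviewed before anything is removed.

```python
import fnmatch
import os
import tempfile

# Patterns taken from the instructions in this message.
PATTERNS = ("mpich2*.dll", "mpe*.dll")

def matching_dlls(directory, patterns=PATTERNS):
    """List file names in `directory` matching any pattern, ignoring case.

    Dry run only: nothing is deleted.
    """
    found = []
    for name in os.listdir(directory):
        is_file = os.path.isfile(os.path.join(directory, name))
        if is_file and any(fnmatch.fnmatchcase(name.lower(), p) for p in patterns):
            found.append(name)
    return sorted(found)

# Demo against a throwaway directory with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("mpich2.dll", "mpe.dll", "kernel32.dll"):
        open(os.path.join(tmp, name), "w").close()
    print(matching_dlls(tmp))
```

Pointing the function at the system32 directory of each host gives a list to compare against the `dir` output requested earlier in the thread.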
-----Original Message-----
From: [email protected] [mailto:[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and the
10-minute wait).
Also, here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005). I
should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
from mpich.
Originally by Ayer, Timothy C. on 2008-10-27 11:49:11 -0500
Hi Jayesh,
Thanks for checking. Unfortunately I have not. If you folks prefer to
close the ticket we can reopen once I get more information.
Tim
________________________________
From: Jayesh Krishna [mailto:[email protected]]
Sent: Thursday, October 23, 2008 5:24 PM
To: Ayer, Timothy C.
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
Did you get a chance to look at the setup ?
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of mpich2
Sent: Thursday, September 11, 2008 12:09 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
Jayesh,
I apologize for the delay. I hope to get back to this soon but other
items have taken higher priority.
Thanks,
Tim
_____
From: Ayer, Timothy C.
Sent: Wednesday, August 13, 2008 4:10 PM
To: Jayesh Krishna; Ayer, Timothy C.
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Will do.
_____
From: Jayesh Krishna [mailto:[email protected]]
Sent: Wednesday, August 13, 2008 4:02 PM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
Hi,
I just cross-verified the timestamps of the dlls and they look
alright.
Make sure that you have the date/timestamps right on all the hosts
involved.
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]
<mailto:[email protected]> ] On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:53 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
That's a bummer. I thought for sure that must be it... oh well. I will
pursue the other two options.
Thanks,
Tim
-----Original Message-----
From: [email protected]
[mailto:[email protected]
<mailto:[email protected]> ]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 3:50 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
----
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
----
Comment (by Jayesh Krishna):
Hi,
I spoke too soon. We have discontinued support for the sshm channel, and
that is the reason you have an old version of the sshm-related dlls in your
system32 directory.
Regards,
Jayesh
-----Original Message-----
From: [email protected] [mailto:owner- <mailto:owner->
[email protected]]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:38 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Jayesh Krishna):
Hi,
Hmmm... This looks like the problem that I mentioned in my email.
The sshm dlls should have the same datestamp as the other dlls (they should
not be from 2005!).
Please try the following,
# Uninstall MPICH2 on the hosts involved in your job.
# Manually delete the MPICH2 dlls from the windows\system32 directory (Please
be careful! Make sure that you delete only mpich2*.dll & mpe*.dll).
# Re-install MPICH2 1.0.7 (stable version) on the hosts/nodes.
# Re-compile cpi.c/fpi.f and try running your job.
Let us know the results.
Regards,
Jayesh
-----Original Message-----
From: [email protected]
[mailto:[email protected]
<mailto:[email protected]> ]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:11 PM
To: undisclosed-recipients:
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
------------------------------------------------------------+----
Comment (by Ayer, Timothy C.):
Hi Jayesh,
Great to hear from you. I will try your suggestions (icpi.c and slow
response).
Also here is the output you requested. I have been wondering why the
dates on mpich2sshm.dll and mpich2sshmp.dll seem so old (from 2005)?
I should have mentioned it sooner.
Thanks,
Tim
C:\WINDOWS\system32>dir c:\windows\system32\mpe*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of c:\windows\system32
04/04/2008 05:46 PM 135,168 mpe.dll
1 File(s) 135,168 bytes
0 Dir(s) 4,497,502,208 bytes free
C:\WINDOWS\system32>
C:\WINDOWS\system32>dir dir c:\windows\system32\mpich2*.dll
Volume in drive C is System
Volume Serial Number is D8B5-0657
Directory of C:\WINDOWS\system32
Directory of C:\WINDOWS\system32
04/04/2008 05:28 PM 1,110,016 mpich2.dll
04/04/2008 05:47 PM 151,552 mpich2mpe.dll
04/04/2008 05:23 PM 159,744 mpich2mpi.dll
04/04/2008 06:31 PM 1,159,168 mpich2mt.dll
04/04/2008 06:42 PM 1,351,680 mpich2mtp.dll
04/04/2008 05:43 PM 1,306,624 mpich2p.dll
04/04/2008 05:55 PM 1,093,632 mpich2shm.dll
04/04/2008 06:03 PM 1,294,336 mpich2shmp.dll
11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll <<<<<<<<<<<<<<<<
11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll <<<<<<<<<<<<<<<<
04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll
04/04/2008 06:22 PM 1,343,488 mpich2ssmp.dll
12 File(s) 12,419,072 bytes
0 Dir(s) 4,497,502,208 bytes free
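The datestamp comparison on a listing like the one above can be automated. The following is a minimal sketch only, not part of MPICH2: `find_stale_dlls` is a hypothetical helper, and it assumes `dir` output in the US date format shown here (MM/DD/YYYY with AM/PM). It reports the dlls whose date differs from the most common date in the listing; a mismatch is only a prompt for a closer look, not proof of a bad install.

```python
import re
from collections import Counter

# One file row of Windows `dir` output, e.g.
# "04/04/2008 05:28 PM 1,110,016 mpich2.dll"
# (assumes the US date/time layout used in the listing above)
ROW = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{4})\s+\d{2}:\d{2}\s+[AP]M\s+[\d,]+\s+(\S+\.dll)",
    re.IGNORECASE,
)


def find_stale_dlls(dir_output):
    """Return {filename: date} for dlls whose date differs from the most common date."""
    rows = []
    for line in dir_output.splitlines():
        match = ROW.match(line)
        if match:
            rows.append(match.groups())  # (date, filename)
    if not rows:
        return {}
    majority_date, _ = Counter(date for date, _ in rows).most_common(1)[0]
    return {name: date for date, name in rows if date != majority_date}


if __name__ == "__main__":
    sample = (
        "04/04/2008 05:28 PM 1,110,016 mpich2.dll\n"
        "04/04/2008 05:55 PM 1,093,632 mpich2shm.dll\n"
        "11/23/2005 02:33 AM 1,032,192 mpich2sshm.dll\n"
        "11/23/2005 02:36 AM 1,294,336 mpich2sshmp.dll\n"
        "04/04/2008 06:14 PM 1,122,304 mpich2ssm.dll\n"
    )
    for name, date in find_stale_dlls(sample).items():
        print(f"{name}: {date}")
```

Running the same check on the listing from each host and comparing the results would cover the "same datestamp on both hosts" sanity check as well.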
-----Original Message-----
From: [email protected]
[mailto:[email protected]
<mailto:[email protected]> ]
On Behalf Of mpich2
Sent: Wednesday, August 13, 2008 2:58 PM
Subject: Re: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
------------------------------------------------------------+-----------
--
--
Reporter: "Ayer, Timothy C." <[email protected]> | Owner: jayesh
Type: bug | Status: assigned
Priority: major | Component: mpich2
Resolution: | Keywords:
------------------------------------------------------------+-----------
--
--
Comment (by Jayesh Krishna):
Hi,
The logs you sent show that the communication between the process
managers on the hosts is good. The problem looks to be with the
communication between the MPI processes.
# Can you try compiling icpi.c (MPICH2\examples) and running the program in
your setup (to make sure that the problem is not related to the Fortran
bindings)?
# I have seen that sometimes the uninstall/install of MPICH2 does not result
in the dlls being updated correctly. (This has led to some weird,
difficult-to-debug hangs in our tests. This is not usual, but it does not
hurt to check for it.) To make sure that you have the right dlls, try
listing the MPICH2 dlls in the windows system32 directory on both hosts,
>>> dir c:\windows\system32\mpich2*.dll
>>> dir c:\windows\system32\mpe*.dll
Send us the results for verification (sanity check: they should have the
same datestamp).
# Also, when running fpi.exe in your setup, try leaving the job running for
10 minutes or so (or specify a timeout of about 10 minutes) and see if it
reports any errors. You might want to run netstat (or use "Process
Explorer" from Microsoft and check the TCP/IP tab in
process->properties) to see what happens to the connections between the MPI
processes on both hosts.
(PS: The MPICH2 1.1.0a1 release
(http://www.mcs.anl.gov/research/projects/mpich2/downloads/index.php?s=downloads)
is aimed at MPICH2 devs and not for production machines.)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Tuesday, August 05, 2008 9:20 AM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
Please find attached the output from the smpd -d procs. Also, the output
from the mpiexec just so you can see what I typed.
H:\>mpiexec.exe -map v:\\10.30.73.170\temp -hosts 2 10.30.73.170 10.30.73.34 v:\fpi.exe
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
100
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 5:10 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
The socket/channel connection between the MPI processes takes place during
MPI_Bcast() (not before that in fpi.f).
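Because the process-to-process connection is only made at MPI_Bcast(), a hang there is consistent with the two hosts being unable to open a TCP connection to each other. Below is a minimal sketch of a generic reachability probe, independent of MPICH2. `can_connect` is a hypothetical helper, and the port to test is an assumption: the thread does not say which ports the MPI processes use, so something must already be listening on the chosen port on the remote host.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Local self-check: listen on an OS-chosen port, then probe it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    print("listener reachable:", can_connect("127.0.0.1", port))
    listener.close()
```

To use it across machines, run a listener on one host and call `can_connect` with that host's IP address from the other, then repeat in the opposite direction.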
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 4:00 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
The firewall has been disabled.
The inputs were from me entering values for estimating pi... I wanted to
make sure the program ran through all the logic.
I will send the other debug output a little later.
Also, as an FYI, we have been running MPICH on thousands of PCs for
years now. The other strange part is that over a year ago I did
successfully run MPICH2 on over 30 processors. My first thought was the
firewall as well.
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 4:46 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
# Do you have Windows firewall (or any firewall) running on these machines?
# Why do I see two inputs (10 & 100) in the mpiexec debug output?
# Can you send us the debug output of smpd along with mpiexec?
# Can you check the status of the remote smpd from each host?
--- On host A, run "smpd -status IPAddressOf_hostB"
--- On host B, run "smpd -status IPAddressOf_hostA"
(PS: I just tried running fpi.exe in a shared drive across two 32-bit
Windows XP machines in our lab but did not get any errors/hang)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 3:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
This is the same fpi.f which comes with the installation with the
exception that I have added print statements.
The setup is homogenous (both 32-bit). The output is attached.
Thanks for your help.
Tim
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 3:48 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
# Are you running the fpi.exe (fpi.f) provided with MPICH2? (Have you modified
the program?)
# I am assuming that the setup is not heterogeneous. (MPICH2 currently does
not support running jobs across machines with different data models, e.g.
you cannot run your MPI job across 32-bit and 64-bit machines.)
# Please provide us with the debug/verbose output when running fpi.exe.
Start smpd on both machines in debug mode (1. Stop any instances of
smpd running on the system: smpd -stop 2. Start smpd in debug mode: smpd -d)
and run mpiexec in verbose mode:
mpiexec.exe -verbose -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
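The command above follows a fixed shape: an optional -verbose, a -map of a drive letter to a UNC share, -hosts followed by the host count and the hosts, and the executable addressed through the mapped drive. As a rough sketch only, the hypothetical helper below (`build_mpiexec_args` is not an MPICH2 API) assembles that argument list using just the options that appear in this thread.

```python
def build_mpiexec_args(drive, unc_share, hosts, exe_name, verbose=False):
    """Assemble an mpiexec argument list in the shape used in this thread.

    drive:     drive letter to map, e.g. "y"
    unc_share: UNC path of the share, e.g. the temp share on hostA
    hosts:     host names or IP addresses, one entry per process
    exe_name:  executable inside the share, e.g. "fpi.exe"
    """
    if not hosts:
        raise ValueError("at least one host is required")
    args = ["mpiexec.exe"]
    if verbose:
        args.append("-verbose")
    # -map takes <drive>:<UNC share>
    args += ["-map", f"{drive}:{unc_share}"]
    # -hosts takes the host count followed by the hosts themselves
    args += ["-hosts", str(len(hosts)), *hosts]
    # The executable is addressed through the mapped drive
    args.append(f"{drive}:\\{exe_name}")
    return args


if __name__ == "__main__":
    args = build_mpiexec_args(
        "y",
        r"\\IPAddressOf_hostA\temp",
        ["IPAddressOf_hostA", "IPAddressOf_hostB"],
        "fpi.exe",
        verbose=True,
    )
    print(" ".join(args))
```

Keeping the arguments as a list rather than a single string avoids quoting problems if the list is later handed to something like `subprocess.run`.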
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 2:21 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
Thanks, here is the output (note: I have not included IP addresses or
actual hostnames in this email but did use them in testing)
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
OUTPUT:
Process 0 of 2 is alive
Enter the number of intervals: (0 quits)
Process 1 of 2 is alive
Before bcast 1 of 2 is alive
10
Before bcast 0 of 2 is alive
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
XXXXXX (hostname of hostA)
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 3:13 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
The command hostname (c:\windows\system32\hostname.exe)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 2:11 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
You have "hostname" at the end of the second line... what is that
referring to?
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 2:47 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
What is the error message (output) that you get when you run mpiexec?
Please provide us with the output of the following commands (make sure that
you specify the IP addresses of the hosts involved),
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp -hosts 2 IPAddressOf_hostA IPAddressOf_hostB y:\fpi.exe
# mpiexec.exe -map y:\\IPAddressOf_hostA\temp hostname
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 1:25 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
No, this does not work... the behavior is the same. The UNCs should work
(and have worked) regardless of whether a user is logged in. We have never
relied on network drive mappings since they are intermittently an
"interactive" feature.
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 2:02 PM
To: 'Ayer, Timothy C.'
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
You should try,
mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB y:\fpi.exe
Let us know if it works for you.
(PS: The shared drive is accessible across machines because the drive is
accessible/mapped by the user logged on to the machines. SMPD runs as a
service logged on as "Local System" and does not, and should not, have access
to drives shared by users.)
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]]
Sent: Monday, August 04, 2008 12:50 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
The exe can be directly accessed from hostB by executing
\\hostA\temp\fpi.exe; that is, you could type it directly into a command
prompt on hostB if you wanted. Note also that the \temp directory is a
shared location. I am not sure how this is physically set up on our
network, but this has worked without any "mapping" for MPICH (MPICH1).
Note: I did try: mpiexec.exe -map y:\\hostA\temp -hosts 2 hostA hostB
\\hostA\temp\fpi.exe but that still hangs in the MPI_Bcast call.
The interesting part is that it gets through the initialization:
call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
All execute.
Thanks,
Tim
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 1:33 PM
To: 'Ayer, Timothy C.'
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
How (what mechanism) does hostB access data (exe) in hostA ?
Regards,
Jayesh
_____
From: Ayer, Timothy C. [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 12:31 PM
To: Jayesh Krishna
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
Thanks Jayesh for the quick reply. This is a network-available UNC path -
why do I need to map a drive?
I am familiar with the machines file - I was just using the command line
for debugging.
_____
From: Jayesh Krishna [mailto:[email protected]
<mailto:[email protected]> ]
Sent: Monday, August 04, 2008 10:56 AM
To: [email protected]
Cc: [email protected]
Subject: RE: [mpich2-maint] #36: MPICH2 fpi.exe hanging on Windows
XP
Hi,
If you are running your executable from a shared network drive you need
to map (see "--map" option of mpiexec in the Windows developer's guide)
the network drive with mpiexec when launching your job.
Also make sure that you have turned the Windows firewall (or any other
firewalls) off on the machines involved in the job.
Try specifying the IP addresses of the machines instead of the hostnames.
Let us know the results.
(PS: Instead of the "-hosts" option you could try using the "-machinefile"
option available with mpiexec. See the Windows developer's guide for
details.)
Regards,
Jayesh
from mpich.
Originally by Ayer, Timothy C. on 2008-10-27 11:49:11 -0500
Attachment added: part0001.10.html
(48.2 KiB)
Added by email2trac
from mpich.
Originally by jayesh on 2008-10-27 12:08:43 -0500
Closing the ticket for now - reopen when user provides more information
-Jayesh
from mpich.