In 1.2.33 I have been unable to run step 100 with 4 tmux processes. All but one of the processes eventually segfault in a call to save_fits in COREMOD_iofits.c. save_fits leads to a call to fits_create_file, which returns fptr = 0 even though FITSIO_status indicates no error. This has happened in several contexts and with several file names.
Examining the file names, it's pretty apparent that multiple tmux processes are often trying to write to the same file name (there is no process ID in the passed file name). This is clearly not safe with concurrent writers. I suppose that the files each process writes under the same name have the same contents, but if one process has the file open for writing while another tries to open the same file, I'd expect a problem.
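For illustration, here is a minimal sketch of the usual way to avoid this kind of collision: each process writes to a temporary name that includes its own PID, then renames the finished file onto the shared name. This is not the code in COREMOD_iofits.c. The function name write_file_atomic and the ".tmp.<pid>" suffix are made up for the example, and it assumes a POSIX system where rename() replaces the destination atomically on the same filesystem.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of COREMOD_iofits.c): write len bytes to
 * path by first writing a process-private temporary file next to it and
 * then renaming that file over the destination. Returns 0 on success,
 * -1 on any failure. */
int write_file_atomic(const char *path, const void *buf, size_t len)
{
    char tmp[1024];

    /* The PID makes the temporary name unique per process. */
    int n = snprintf(tmp, sizeof tmp, "%s.tmp.%ld", path, (long) getpid());
    if (n < 0 || (size_t) n >= sizeof tmp)
        return -1;              /* temporary name would be truncated */

    FILE *fp = fopen(tmp, "wb");
    if (fp == NULL)
        return -1;

    int ok = (fwrite(buf, 1, len, fp) == len);
    if (fclose(fp) != 0)        /* fclose flushes; a failure means data loss */
        ok = 0;

    /* Readers and other writers only ever see a complete file under path. */
    if (!ok || rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

With this pattern, two processes writing the same path never share an open file: each fills its own temporary file, and whichever rename happens last wins. The PID alone would not be enough to separate several threads inside one process; that case would need a thread identifier or a counter in the temporary name as well.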
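I'm currently trying the solution of having save_fits call save_fits_atomic, which is thread safe. This requires a little care with the filename, because the leading '!' (which tells CFITSIO to overwrite an existing file) has to be stripped first, so the version of save_fits I'm trying is: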
int save_fits(char *ID_name, char *file_name)
{
    char savename[1000];

    if (file_name[0] == '!')
        strcpy(savename, file_name+1); // avoid the leading '!'
    else
        strcpy(savename, file_name);

    return save_fits_atomic(ID_name, savename);
}
This seems to be working, though I won't know for sure until step 100 is complete.
Edit: the above fix is working: I have completed 4 cases of steps 100 and beyond, and have completed one case of steps 0 through 15 with this change.
This change makes the following two assumptions:
Here are a couple of example core dumps from the old version of save_fits.
Steves-Mac-Pro:tests steve$ lldb --core /Cores/core.59757
(lldb) target create --core "/Cores/core.59757"
warning: (x86_64) /Cores/core.59757 load command 235 LC_SEGMENT_64 has a fileoff + filesize (0xf2f8a000) that extends beyond the end of the file (0xf2f89000), the segment will be truncated to match
warning: (x86_64) /Cores/core.59757 load command 236 LC_SEGMENT_64 has a fileoff (0xf2f8a000) that extends beyond the end of the file (0xf2f89000), ignoring this section
Core file '/Cores/core.59757' (x86_64) was loaded.
(lldb) di -l
PIAACMCdesign was compiled with optimization - stepping may behave oddly; variables may not be available.
PIAACMCdesign`ffcrim + 62 at putkey.c:32
31
-> 32 if (fptr->HDUposition != (fptr->Fptr)->curhdu)
33 ffmahd(fptr, (fptr->HDUposition) + 1, NULL, status);
PIAACMCdesign`ffcrim:
-> 0x1018bef3e <+62>: movl (%r12), %esi
(lldb) bt
* thread #1: tid = 0x0000, 0x00000001018bef3e PIAACMCdesign`ffcrim(fptr=0x0000000000000000, bitpix=-32, naxis=3, naxes=0x00007fff5e410ff0, status=0x0000000109a7e218) + 62 at putkey.c:32, stop reason = signal SIGSTOP
* frame #0: 0x00000001018bef3e PIAACMCdesign`ffcrim(fptr=0x0000000000000000, bitpix=-32, naxis=3, naxes=0x00007fff5e410ff0, status=0x0000000109a7e218) + 62 at putkey.c:32 [opt]
frame #1: 0x00000001017ef89f PIAACMCdesign`save_fl_fits(ID_name="psfi0", file_name="!piaacmcconf_i000/psfi0_pt002.fits") + 799 at COREMOD_iofits.c:1039
frame #2: 0x00000001017f0e1b PIAACMCdesign`save_fits(ID_name="psfi0", file_name="!piaacmcconf_i000/psfi0_pt002.fits") + 123 at COREMOD_iofits.c:1416
(lldb) p fptr
(fitsfile *) $0 = 0x0000000000000000
(lldb) up
frame #1: 0x00000001017ef89f PIAACMCdesign`save_fl_fits(ID_name="psfi0", file_name="!piaacmcconf_i000/psfi0_pt002.fits") + 799 at COREMOD_iofits.c:1039
1036 }
1037 }
1038
-> 1039 fits_create_img(fptr, FLOAT_IMG, (int) naxis, naxes, &FITSIO_status);
1040 if(check_FITSIO_status(__FILE__,__func__,__LINE__,1)!=0)
1041 {
1042 fprintf(stderr, "%c[%d;%dm Error while calling \"fits_create_img\" %c[%d;m\n", (char) 27, 1, 31, (char) 27, 0);
(lldb) p fptr
(fitsfile *) $1 = 0x0000000000000000
(lldb) up
frame #2: 0x00000001017f0e1b PIAACMCdesign`save_fits(ID_name="psfi0", file_name="!piaacmcconf_i000/psfi0_pt002.fits") + 123 at COREMOD_iofits.c:1416
1413 atype = data.image[ID].md[0].atype;
1414 switch(atype) {
1415 case FLOAT:
-> 1416 save_fl_fits(ID_name, file_name);
1417 break;
1418 case DOUBLE:
1419 save_db_fits(ID_name, file_name);
(lldb) p ID_name
(char *) $2 = 0x0000000101908555 "psfi0"
(lldb) p file_name
(char *) $3 = 0x00007fff5e411500 "!piaacmcconf_i000/psfi0_pt002.fits"
Steves-Mac-Pro:tests steve$ lldb --core /Cores/core.60308
(lldb) target create --core "/Cores/core.60308"
warning: (x86_64) /Cores/core.60308 load command 183 LC_SEGMENT_64 has a fileoff + filesize (0xa5d0b000) that extends beyond the end of the file (0xa5d0a000), the segment will be truncated to match
warning: (x86_64) /Cores/core.60308 load command 184 LC_SEGMENT_64 has a fileoff (0xa5d0b000) that extends beyond the end of the file (0xa5d0a000), ignoring this section
Core file '/Cores/core.60308' (x86_64) was loaded.
(lldb) bt
PIAACMCdesign was compiled with optimization - stepping may behave oddly; variables may not be available.
* thread #1: tid = 0x0000, 0x000000010dc2bf3e PIAACMCdesign`ffcrim(fptr=0x0000000000000000, bitpix=-64, naxis=2, naxes=0x00007fff520a3e10, status=0x0000000115deb218) + 62 at putkey.c:32, stop reason = signal SIGSTOP
* frame #0: 0x000000010dc2bf3e PIAACMCdesign`ffcrim(fptr=0x0000000000000000, bitpix=-64, naxis=2, naxes=0x00007fff520a3e10, status=0x0000000115deb218) + 62 at putkey.c:32 [opt]
frame #1: 0x000000010db5be62 PIAACMCdesign`save_db_fits(ID_name="fpmza", file_name="!piaacmcconf_i000/fpm_zonea_s2_l0565_sr10_nbr032_mr300_minsag-02000_maxsag002000_fpmreg001000_ssr50_ssm0_Mirror_wb10.fits") + 322 at COREMOD_iofits.c:880
frame #2: 0x000000010db5de3b PIAACMCdesign`save_fits(ID_name="fpmza", file_name="!piaacmcconf_i000/fpm_zonea_s2_l0565_sr10_nbr032_mr300_minsag-02000_maxsag002000_fpmreg001000_ssr50_ssm0_Mirror_wb10.fits") + 155 at COREMOD_iofits.c:1419
(lldb) p fptr
(fitsfile *) $0 = 0x0000000000000000
(lldb) up
frame #1: 0x000000010db5be62 PIAACMCdesign`save_db_fits(ID_name="fpmza", file_name="!piaacmcconf_i000/fpm_zonea_s2_l0565_sr10_nbr032_mr300_minsag-02000_maxsag002000_fpmreg001000_ssr50_ssm0_Mirror_wb10.fits") + 322 at COREMOD_iofits.c:880
877 }
878
879
-> 880 fits_create_img(fptr, DOUBLE_IMG, naxis, naxes, &FITSIO_status);
881 if(check_FITSIO_status(__FILE__,__func__,__LINE__,1) != 0)
882 {
883 fprintf(stderr, "%c[%d;%dm Error while calling \"fits_create_img\" %c[%d;m\n", (char) 27, 1, 31, (char) 27, 0);
(lldb) p fptr
(fitsfile *) $1 = 0x0000000000000000
(lldb) p FITSIO_status
(int) $4 = 0
(lldb) up
frame #2: 0x000000010db5de3b PIAACMCdesign`save_fits(ID_name="fpmza", file_name="!piaacmcconf_i000/fpm_zonea_s2_l0565_sr10_nbr032_mr300_minsag-02000_maxsag002000_fpmreg001000_ssr50_ssm0_Mirror_wb10.fits") + 155 at COREMOD_iofits.c:1419
1416 save_fl_fits(ID_name, file_name);
1417 break;
1418 case DOUBLE:
-> 1419 save_db_fits(ID_name, file_name);
1420 break;
1421 case USHORT:
1422 save_sh_fits(ID_name, file_name);
(lldb) p ID_name
(char *) $2 = 0x000000010dc7503a "fpmza"
(lldb) p file_name
(char *) $3 = 0x00007fff520a4300 "!piaacmcconf_i000/fpm_zonea_s2_l0565_sr10_nbr032_mr300_minsag-02000_maxsag002000_fpmreg001000_ssr50_ssm0_Mirror_wb10.fits"