Comments (14)
Thanks for opening this issue, @jedwards4b. I have noticed this as well, mainly (perhaps only) with the intel compiler (gnu build times are fast, or at least they were a year ago, when I noticed this issue with intel build times).
I noticed this got worse about a year ago:
- From cism2_1_78: CISM build time was 182 seconds
- From cismwrap_2_1_79: CISM build time was 378 seconds
from cism.
@jedwards4b and @billsacks, thanks for looking at this. Adding @Katetc to the thread. I'd very much like to identify and fix the problem. I usually use the gnu compiler for code development because intel is so slow.
What's the best way to approach the issue? Are there some general rules about code structures to avoid? Or good ways to identify the offending procedures or lines of code?
I don't have any good strategies for approaching this. I would probably start by identifying the offending file(s) by looking at the build time of each file. I'm not sure if there's a way to get build time information for each file in the build log (@jedwards4b do you know?); if not, you could set GMAKE_J=1 then watch the build log output and see if it stalls out on a file. Assuming you can identify a problematic file, you could look at the diffs between cism2_1_78 and cismwrap_2_1_79 to see if anything looks like a likely culprit. But I'm not sure how easy it will be to identify that. I guess my hope would be that we could identify an offending file without too much trouble, and then, if we're lucky, the diffs won't be too extensive and/or there will be something fairly obviously weird about the changes in that file....
@jedwards4b do you have any suggestions for a better way to look into this? Also, I'm wondering if, before spending a lot of time on this, it would be worth trying the build with a more recent version of the intel compiler (we're using v 19 on cheyenne, so 3 years old): it may be that the problem goes away with a more recent compiler version, in which case it might not be worth spending a lot of time trying to figure this out. However, I'm also not sure how hard it would be to get the build working with a newer intel version.
So if you look at the timestamps of the object files produced I think you can get some idea of what is going on:
For example this build started at 13:13 as evidenced by the timestamp of the Filepath file:
-rw-r--r-- 1 jedwards ncar 342 Apr 20 13:13 Filepath
and ended at 13:20 with the nuopc cap file
-rw-r--r-- 1 jedwards ncar 76424 Apr 20 13:20 glc_comp_nuopc.o
It looks like most of the time was spent in compiling the glide_io file:
-rw-r--r-- 1 jedwards ncar 238870 Apr 20 13:14 glissade_velo.mod
-rw-r--r-- 1 jedwards ncar 73564004 Apr 20 13:18 glide_io.mod
-rw-r--r-- 1 jedwards ncar 449775 Apr 20 13:19 glide_stop.mod
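The timestamp comparison above can be automated. The following is a minimal sketch, not part of the CISM or CIME tooling: the function name `artifact_gaps` is hypothetical, and it assumes a serial build (for example with GMAKE_J=1), since in a parallel build the gap between consecutive artifacts no longer maps to a single file. It reads modification times with `os.stat`, which are usually finer-grained than the minute resolution shown in the listing.

```python
from pathlib import Path


def artifact_gaps(build_dir, suffixes=(".o", ".mod")):
    """Return (gap_seconds, filename) pairs, largest gap first.

    The gap is the time between an artifact's modification time and
    that of the artifact written just before it. In a serial build
    this is a rough proxy for how long that file took to compile.
    """
    files = [p for p in Path(build_dir).iterdir()
             if p.is_file() and p.suffix in suffixes]
    # Order artifacts by the time they were last written.
    files.sort(key=lambda p: p.stat().st_mtime)
    gaps = []
    for prev, cur in zip(files, files[1:]):
        gaps.append((cur.stat().st_mtime - prev.stat().st_mtime, cur.name))
    return sorted(gaps, reverse=True)
```

Calling `artifact_gaps` on the directory that holds the object and module files and printing the first few entries should put the slowest file at the top, assuming nothing else touched those files after the build.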
Thanks for pointing this out, guys. I've been looking at it this afternoon (starting to have more time for land ice work!) and I do see 4 minutes or so spent building glide_io.F90. This file is auto-generated at build time, but that doesn't actually seem to be the slow part. The slow part is the actual compiling of the file. Now, I know a big difference between cism2_1_78 and cismwrap_2_1_79 was the number of namelist fields. We added several new namelist and output fields between these tags. I'm not sure, but I think glide_io.F90 became much longer after this tag. And I'm noticing this file uses a weird C preprocessor #define (c-def) method for abbreviating references to the output and input files:
#define NCO outfile%nc
#define NCI infile%nc
And both of these c-def variables are referenced many, many times:
if (.not.outfile%append) then
   status = parallel_def_dim(NCO%id,'x0',model%parallel%global_ewn-1,x0_dimid)
else
   status = parallel_inq_dimid(NCO%id,'x0',x0_dimid)
endif
Referencing the components of a derived type through c-defined names like this is not something I've seen very often. I could see the Intel Fortran compiler (or another Fortran compiler) having some issues with it.
@Katetc, Nice sleuthing. If you have time tomorrow, let's follow up and talk about whether we can get the same functionality without the c-def variables.
I would be surprised if the cpp macros were the cause of the slowdown.
@jedwards4b, is there another possible explanation?
The file glide_io.F90 is autogenerated, but that step happens very quickly. It is the fortran compile of the autogenerated file that is taking so long. I timed it at 4:47 with -O2 and 4:14 with -O0. Subroutine glide_io_create is some 7000 lines.
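To see which procedures dominate the generated file, a rough line count per subroutine can help. This is a sketch under stated assumptions: the helper name `subroutine_lengths` is hypothetical, it only recognizes procedures closed with an explicit `end subroutine`, and it does not handle subroutines nested inside other subroutines.

```python
import re

# Matches "subroutine name", optionally preceded by common prefixes.
START = re.compile(
    r"^\s*(?:(?:recursive|pure|elemental)\s+)*subroutine\s+(\w+)", re.I)
# Matches "end subroutine" or "endsubroutine".
END = re.compile(r"^\s*end\s*subroutine\b", re.I)


def subroutine_lengths(source):
    """Return (line_count, name) pairs for each subroutine, longest first."""
    lengths = []
    name, start = None, 0
    for lineno, line in enumerate(source.splitlines(), start=1):
        if END.match(line):
            if name is not None:
                # Count both the opening and the closing line.
                lengths.append((lineno - start + 1, name))
                name = None
        else:
            m = START.match(line)
            if m and name is None:
                name, start = m.group(1), lineno
    return sorted(lengths, reverse=True)
```

Passing the text of glide_io.F90 to this function should list glide_io_create near the top if it is indeed around 7000 lines.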
@jedwards4b, Indeed it's a long file, but there are other big files in CISM that compile in a few seconds. I'm wondering if there are specific structures in the autogenerated file that trip up the Intel compiler (but which the gnu compiler, for whatever reason, handles more efficiently). If we can identify those structures, then we may be able to modify the autogenerate script to do things differently.
Here's another possibility. At the end of module glide_io.F90 there are many accessor subroutines, of the form glide_set_field(data, inarray) and glide_get_field(data,outarray). Each subroutine uses four modules (glimmer_scales, glimmer_paramets, glimmer_physcon, glide_types) without an 'only' specification. Is it taking the compiler a long time to bring in the other modules? If so, we could either figure out a way to add the appropriate 'only', or do without these subroutines entirely. The used modules, especially glide_types, have grown over time.
It also seems possible that just having so many separate use statements could cause problems, whether or not they have an "only" clause. What about consolidating them so that they appear at the top of the module rather than being separately listed for each subroutine?
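Before changing the generator, it may be worth measuring how many use statements the generated file actually contains, and how many of them lack an 'only' clause. The following is a minimal sketch; the helper name `count_uses` is hypothetical, and the regular expression covers only single-line use statements (no continuation lines, and one statement per line).

```python
import re
from collections import Counter

# Captures the module name and, if present, a ", only:" clause.
# Also accepts the "use, intrinsic :: name" form.
USE = re.compile(
    r"^\s*use\b\s*(?:,\s*\w+\s*::\s*|::\s*)?(\w+)\s*(,\s*only\s*:)?", re.I)


def count_uses(source):
    """Return (with_only, without_only) Counters keyed by module name."""
    with_only, without_only = Counter(), Counter()
    for line in source.splitlines():
        code = line.split("!", 1)[0]  # ignore trailing comments
        m = USE.match(code)
        if m:
            target = with_only if m.group(2) else without_only
            target[m.group(1).lower()] += 1
    return with_only, without_only
```

If the second counter shows glide_types and the other three modules repeated once per accessor subroutine, that gives a concrete number to compare against after the use statements are consolidated at the top of the module.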
@billsacks, That's a good suggestion, and easy to implement. I'll give it a try.