Comments (18)
Tested Ratko's latest version of his branch (852405d) on Hera using my C424 test case: /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug
Intel/Release, Intel/Debug, GNU/Release, GNU/Debug all gave bit identical answers. Yeah!
from ufs_utils.
A small test script with data was created to run the filter_topo program stand-alone (on venus):
/gpfs/dell2/emc/modeling/noscrub/George.Gayno/filter_topo_bug
from ufs_utils.
I tested the script on Venus using 'develop' at 0f5f3cb. A baseline 'oro' file was created using the standard compile. I then recompiled with the 'debug' option on:
diff --git a/build_all.sh b/build_all.sh
index fe457c9c..70936332 100755
--- a/build_all.sh
+++ b/build_all.sh
@@ -26,7 +26,7 @@ else
set -x
fi
-CMAKE_FLAGS="-DCMAKE_INSTALL_PREFIX=../ -DEMC_EXEC_DIR=ON"
+CMAKE_FLAGS="-DCMAKE_INSTALL_PREFIX=../ -DEMC_EXEC_DIR=ON -DCMAKE_BUILD_TYPE=Debug"
I was hoping it would crash with an out of bounds array or divide by zero. But it ran to completion. The output did not match the baseline:
+ nccmp -dmfqS oro.C424.tile7.nc oro.C424.tile7.nc.1pass
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 415 2.26251 2429.39 -231.994 232.415 464.409 0.00545183 24.9542
Some differences are large. All differences were confined to the top and bottom three rows.
from ufs_utils.
A small test script with data was created to run the filter_topo program stand-alone (on venus):
/gpfs/dell2/emc/modeling/noscrub/George.Gayno/filter_topo_bug
Ported the test script to Hera - /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug
from ufs_utils.
Tested the script on Hera. Compiled 'develop' at 0f5f3cb using the GNU compiler with the debug option (DCMAKE_BUILD_TYPE=Debug). Was hoping it would find a memory leak or an out-of-bounds array. It did crash with memory leaks. To fix the leaks, I updated the source code as follows:
--- a/sorc/grid_tools.fd/filter_topo.fd/filter_topo.F90
+++ b/sorc/grid_tools.fd/filter_topo.fd/filter_topo.F90
@@ -81,9 +81,13 @@ program filter_topo
call FV3_zs_filter(is,ie,js,je,isd,ied,jsd,jed,npx,npy,npx,ntiles,grid_type, &
stretch_fac, nested, area, dxa, dya, dx, dy, dxc, dyc, sin_sg, oro, regional )
+ deallocate(sin_sg, dx, dy, mask, dxc, dyc, area, dxa, dya)
+
!--- write out the data
call write_topo_file(is,ie,js,je,ntiles,oro(is:ie,js:je,:),regional )
+ deallocate (oro)
+
contains
The test then ran to completion.
from ufs_utils.
Ran the test on Hera four times:
- Intel with 'release' build type.
- Intel with 'debug' build type.
- GNU with 'release' build type.
- GNU with debug build type.
The output files from each test were compared.
The two Intel tests showed large differences indicative of a problem:
$ nccmp -dmfqS oro.C424.tile7.nc.intel.O3 oro.C424.tile7.nc.intel.debug
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 415 -2.26251 2429.39 -232.415 231.994 464.409 -0.00545183 24.9542
The two GNU tests showed very small differences:
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.O3 oro.C424.tile7.nc.gnu.debug
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 1 -4.13247e-07 4.13247e-07 -4.13247e-07 -4.13247e-07 0 -4.13247e-07 0
The GNU debug and Intel debug showed very small differences:
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.debug oro.C424.tile7.nc.intel.debug
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 1 4.13247e-07 4.13247e-07 4.13247e-07 4.13247e-07 0 4.13247e-07 0
The Intel debug and GNU release tests produced no differences:
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.O3 oro.C424.tile7.nc.intel.debug
/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug/data $
What does this say? That the Intel compiler optimizations are not playing nice with the filter code?
from ufs_utils.
Compiled my bugfix/filter_topo
branch on Hera at 0f5f3cb, acba0fb, and dbe2ba5. Reran the test and compared the output files. None of the changes I made should change results.
Intel 'release' build type:
Comparing 0f5f3cb and acba0fb shows no differences.
$ nccmp -dmfqS oro.C424.tile7.nc.intel.O3.0f5f3cb oro.C424.tile7.nc.intel.O3.acba0fb
/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug/data $
Comparing acba0fb and dbe2ba5 shows large differences.
$ nccmp -dmfqS oro.C424.tile7.nc.intel.O3.acba0fb oro.C424.tile7.nc.intel.O3.dbe2ba
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 534 -0.222672 812.112 -64.0984 66.2278 130.326 -0.000416988 5.64295
Intel 'debug' build type:
Comparing 0f5f3cb and acba0fb shows no differences.
$ nccmp -dmfqS oro.C424.tile7.nc.intel.debug.0f5f3cb oro.C424.tile7.nc.intel.debug.acba0fb
/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug/data $
Comparing acba0fb and dbe2ba5 shows large differences.
$ nccmp -dmfqS oro.C424.tile7.nc.intel.debug.acba0fb oro.C424.tile7.nc.intel.debug.dbe2ba
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 46 0.157118 107.935 -11.7211 32.257 43.978 0.0034156 6.00324
from ufs_utils.
Repeated the tests using GNU.
GNU 'release' build:
Comparing 0f5f3cb and acba0fb shows no differences.
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.O3.0f5f3cb oro.C424.tile7.nc.gnu.O3.acba0fb
/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug/data $
Comparing acba0fb and dbe2ba5 shows no differences.
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.O3.acba0fb oro.C424.tile7.nc.gnu.O3.dbe2ba
/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug/data $
GNU 'debug' build:
Comparing acba0fb and dbe2ba5 shows no differences.
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.debug.acba0fb oro.C424.tile7.nc.gnu.debug.dbe2ba
/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/filter_topo_bug/data $
Recall, the 0f5f3cb tag would not run in 'debug' mode because of memory leaks.
Comparing the 'debug' and 'release' builds at dbe2ba5 shows very small differences, as expected:
$ nccmp -dmfqS oro.C424.tile7.nc.gnu.O3.dbe2ba oro.C424.tile7.nc.gnu.debug.dbe2ba
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 1 -4.13247e-07 4.13247e-07 -4.13247e-07 -4.13247e-07 0 -4.13247e-07 0
from ufs_utils.
There are several arrays being allocated in routine "read_grid_file". Some are initialized with a constant value, while others are not initialized at all. I wonder if the compilers are filling these arrays with random garbage that should be later overwritten (but is not) by valid data. So I updated the branch to initialize these arrays. The values I chose (some were set to a large number, and some to zero) were made so the test would match the baseline. If the code is working properly, then I expect I could initialize these arrays to any value and the result should not not change.
Using the GNU 'debug' option, which is giving me a consistent (but not necessarily correct) answer, I ran ccc6692 on Hera, then compared to my baseline:
$ nccmp -dmfqS oro.C424.tile7.nc oro.C424.tile7.nc.gnu.debug.dbe2ba
Variable Group Count Sum AbsSum Min Max Range Mean StdDev
orog_filt / 1 -4.13247e-07 4.13247e-07 -4.13247e-07 -4.13247e-07 0 -4.13247e-07 0
This is a very small difference. Next, I will check each array.
from ufs_utils.
Something is not right about the setting of 'geolon_c' and 'geolat_c' in 'read_grid_file'. These variables are allocated with a halo:
allocate(geolon_c(isd:ied+1,jsd:jed+1,ntiles))
allocate(geolat_c(isd:ied+1,jsd:jed+1,ntiles))
For my test, the indices are:
isd,ied,jsd,jed -2 225 -2 207
Later the 'grid' file is read and the arrays (not including the halo) are initialized:
geolon_c(1:npx,1:npy,t) = tmpvar(1:ni+1:2,1:nj+1:2)*PI/180.
geolat_c(1:npx,1:npy,t) = tmpvar(1:ni+1:2,1:nj+1:2)*PI/180.
For my test, the indices are:
npx,npy 223 205
At this point the halo contains who knows what.
Later, the routine tries to use these halo values:
do j=jsd,jed
do i=isd+1,ied
g1(1) = geolon_c(i,j,t); g1(2) = geolat_c(i,j,t)
g2(1) = geolon_c(i-1,j,t); g2(2) = geolat_c(i-1,j,t)
Interestingly, there is a routine to fill in halo values. But since this is a regional grid (regional=.true.), it is never called:
if( .not. regional ) then
call fill_cubic_grid_halo(geolon_c, geolon_c, ng, 1, 1, 1, 1)
call fill_cubic_grid_halo(geolat_c, geolat_c, ng, 1, 1, 1, 1)
if(.not. nested) call fill_bgrid_scalar_corners(geolon_c, ng, npx, npy, isd, jsd, XDir)
if(.not. nested) call fill_bgrid_scalar_corners(geolat_c, ng, npx, npy, isd, jsd, YDir)
endif
Will ask the regional group for their take on this.
from ufs_utils.
I got to same line. Since computation of the dxc and dyc requires geolat_c's and geolon_c's halo points, I'll try to fill in halo points like in global case.
from ufs_utils.
I created routine which extrapolates geolat_c and geolon_c halo points. These points are used for dxc and dyc calculation. This should fix the problem.
from ufs_utils.
I created routine which extrapolates geolat_c and geolon_c halo points. These points are used for dxc and dyc calculation. This should fix the problem.
For regional grids, we create several 'grid' files with and without halos. Instead of extrapolating, should one of the 'halo' 'grid' files be used? We only care about filtering on the halo0 file, correct?
from ufs_utils.
I guess we should be OK since it is done for global and regional nest similar way:
if( .not. regional ) then
call fill_cubic_grid_halo(geolon_c, geolon_c, ng, 1, 1, 1, 1)
call fill_cubic_grid_halo(geolat_c, geolat_c, ng, 1, 1, 1, 1)
if(.not. nested) call fill_bgrid_scalar_corners(geolon_c, ng, npx, npy, isd, jsd, XDir)
if(.not. nested) call fill_bgrid_scalar_corners(geolat_c, ng, npx, npy, isd, jsd, YDir)
else
call fill_regional_halo(geolon_c, ng)
call fill_regional_halo(geolat_c, ng)
endif
from ufs_utils.
@RatkoVasic-NOAA Are you working from my branch? I don't see your fork.
from ufs_utils.
I used hard copy:
Hera:/scratch1/NCEPDEV/nems/Ratko.Vasic/filter_topo/UFS_UTILS/sorc/grid_tools.fd/filter_topo.fd
from ufs_utils.
I used hard copy:
Hera:/scratch1/NCEPDEV/nems/Ratko.Vasic/filter_topo/UFS_UTILS/sorc/grid_tools.fd/filter_topo.fd
Since the coding changes are non-trivial, I would create a fork and branch: https://github.com/NOAA-EMC/UFS_UTILS/wiki/3.-Checking-out-code-and-making-changes
from ufs_utils.
I'll do it.
from ufs_utils.
Related Issues (20)
- Improve error handling in the orog code HOT 2
- Update RDHPCS Hera to Rocky-8 compliance HOT 4
- Add gdas_init utility to readthedocs
- Release Version 1.13.0 HOT 2
- Update for Rocky8 on Jet
- rotation angles on corner grid points are wrong in cpld_gridgen
- Add option to build chgres_cube without nemsio and sigio input file support HOT 2
- Update documentation for new AWS fixed files host
- Create utility to down-scale ice and ocean Restarts from 1/4deg to desired resolution HOT 4
- wgrib2 module used by regression tests on Jet was removed HOT 1
- Clean up fv3gfs_make_orog.sh script
- Update version of sfcsub.F in global_cycle program HOT 2
- Compile ufs_utils on Gaea-C5
- add module files to compile on AWS
- Enable OpenMP threading to speed up orog_gsl and filter_topo programs HOT 23
- orog.fd - Remove binary output and reduced grid options
- create two additional remapping weights using cpld_gridgen for use by ocnice prep; deprecate two weights files and regresssion test dependency
- Create unit tests for ocean_merge code HOT 10
- build UFS_UTIL error on Hera HOT 7
- Optimize orog.fd code HOT 10
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ufs_utils.