aqm-utils's People

Contributors

briancurtis-noaa, chan-hoo, jianpinghuang-noaa, ytangnoaa, zmoon

aqm-utils's Issues

GEFS2LBC_PARA: GNU Build Error due to Divide by Zero

When building UFS-SRW-App [develop] ATMAQ on machines that use the GNU Fortran compiler (e.g., GMU Hopper), compiling gefs2clbcs_para fails because of a purposeful divide by zero in the code. This should be revised so that UFS-SRW-App builds with different compilers.

/sorc/AQM-utils/sorc/gefs2clbcs_para.fd/gefs2lbc_para.f90:588:15:

588 | fillval=0./0.
| 1
Error: Division by zero at (1)

@ytangnoaa

bias_correction_pm25 executable error output issue

The bias_correction_pm25 executable aqm_bias_correct did not write its error messages to the error file when the job failed.
Requested developer actions:
(1) Add exception handling to the ex-script to check for dead symlink targets before running the executable.
(2) Replace cp in the job with the production utility cpreq (required by the NCO implementation standard).
(3) Debug and fix the root cause of the executable aqm_bias_correct not writing failure information to errfile (required by the NCO implementation standard).
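
As an illustration of request (1), the following is a minimal sketch of a dead-symlink check that an ex-script could run before launching the executable. The function name check_dead_links and the message wording are hypothetical and do not come from the AQM package. The sketch only inspects non-hidden entries directly inside the given directory.

```shell
#!/bin/bash
# Hypothetical helper (not part of the AQM package): scan one directory for
# symlinks whose targets no longer exist. Each dead link is reported on
# stderr. Returns 0 when every link resolves, 1 otherwise.
check_dead_links() {
  local dir="$1"
  local link
  local status=0
  for link in "${dir}"/*; do
    # -L is true for a symlink; -e follows the link and fails if the target is gone.
    if [ -L "${link}" ] && [ ! -e "${link}" ]; then
      echo "FATAL ERROR: dead symlink ${link} -> $(readlink "${link}")" >&2
      status=1
    fi
  done
  return "${status}"
}
```

An ex-script could call it as `check_dead_links "${DATA}/data" || exit 1` so that the failure reaches stderr, and therefore errfile, before the executable starts. The directory in that call is only an example.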

Debug info:
The Cactus job log from the AQM implementation ecflow parallel shows that the job failed:
/lfs/h2/emc/ptmp/lin.gan/ecflow_aqm/para/output/prod/today/aqm_bias_correction_pm25_00.o120073813
The executable should write its failure information to errfile, but it appears only in the OUTPUT file.

  • exaqm_bias_correction_pm25.sh[230]: eval time /lfs/h2/emc/global/noscrub/lin.gan/para/packages/aqm.v7.0.1/exec/aqm_bias_correct config.pm2.5.bias_corr_793.00z 00z 20230305 20240304 '>>/lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_bias_correction_pm25_00.120073813.cbqs01/OUTPUT.214929' '2>/lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_bias_correction_pm25_00.120073813.cbqs01/errfile'

The tail of the OUTPUT file shows the failure messages:
/lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_bias_correction_pm25_00.120073813.cbqs01/data> cat /lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_bias_correction_pm25_00.120073813.cbqs01/OUTPUT.214929|tail -9
*** All files missing for group: data/bcdata.202403/grid/00z/20240304/aqm.t00z.chem_sfc.fFFF.nc
*** read_gridded_vars: Error reading gridded data for cycle 2024 03 04 00Z
*** Abort output file for this cycle.

*** spreading: Raw PM25_TOT forecast grids for target forecast cycle are not available.
*** Target forecast cycle = 2024 03 04 00Z
*** Fatal, can not complete bias correction for this forecast cycle.
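
Until the root cause inside the executable is fixed, a wrapper in the ex-script could surface these messages in errfile. The sketch below is a stopgap under stated assumptions, not the package's method: the function name run_and_capture is hypothetical, and it assumes fatal messages begin with "*** Fatal" as in the log above. Because the output file is opened in append mode, a fatal line left over from an earlier run in the same file would also trigger it.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of the AQM package): run a command with
# stdout appended to out_file and stderr written to err_file. If the command
# exits nonzero, or its output contains a line starting with "*** Fatal",
# copy every "***" diagnostic line into err_file and return nonzero.
run_and_capture() {
  local out_file="$1"
  local err_file="$2"
  shift 2
  local rc=0
  "$@" >>"${out_file}" 2>"${err_file}" || rc=$?
  if [ "${rc}" -ne 0 ] || grep -q '^\*\*\* Fatal' "${out_file}"; then
    grep '^\*\*\*' "${out_file}" >>"${err_file}" || true
    # Force a failing status even if the program itself exited with 0.
    [ "${rc}" -ne 0 ] || rc=1
  fi
  return "${rc}"
}
```

For example, `run_and_capture "${pgmout}" errfile ./aqm_bias_correct <args>` would leave the fatal lines in errfile. The variable pgmout and the argument list are placeholders.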

Improve exaqm_bias_correction_o3 solution to avoid failure

The exaqm_bias_correction_o3 00Z job running in the AQM implementation ecflow parallel for 2024030400 failed.

The developer is requested to implement the following fixes:
(1) Use the production utility "cpreq" instead of "cp" in exaqm_bias_correction_o3.sh.
(2) exaqm_bias_correction_o3.sh[222] should copy the data used for the job from its DATA location instead of from dcom again, because those files were already checked, used, and copied from dcom to DATA. Copying from DATA at the end of the job ensures that the saved file is a true copy of the file processed in the current job.
(3) exaqm_bias_correction_o3.sh[149] only issues a warning message, but line 213 copies those files with no exception handling for files that are not present.
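
Requests (1) and (3) can be sketched together as a guarded copy that only runs when matching files exist. The function name copy_matching and its return codes are hypothetical. The sketch assumes cpreq accepts the same source and destination arguments as cp, and it falls back to cp when cpreq is not on PATH. For request (2), the source directory passed in would be the job's DATA copy rather than dcom.

```shell
#!/bin/bash
# Hypothetical helper (not part of the AQM package): copy every
# "<prefix>*.dat" file from src_dir to dest_dir.
# Returns 0 on success, 1 if a copy fails, 2 if no file matched.
copy_matching() {
  local src_dir="$1"
  local prefix="$2"
  local dest_dir="$3"
  local copier="cp"
  local found=0
  local f
  # Assumption: cpreq takes "source destination" like cp does.
  if command -v cpreq >/dev/null 2>&1; then
    copier="cpreq"
  fi
  for f in "${src_dir}/${prefix}"*.dat; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    [ -e "${f}" ] || continue
    "${copier}" "${f}" "${dest_dir}/" || return 1
    found=1
  done
  if [ "${found}" -eq 0 ]; then
    echo "WARNING: no ${prefix}*.dat files in ${src_dir}; skipping" >&2
    return 2
  fi
}
```

The caller can then decide whether a return code of 2 is a warning or a fatal condition, instead of letting a later cp fail with "cannot stat".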

Debug information follows:
Line 4342 of the job log shows:
exaqm_bias_correction_o3.sh[149] statement:
cp ${DCOMINairnow}/${PDYm1}/airnow/HourlyAQObs_${PDYm1}*.dat ${COMOUTbicor}/bcdata.${yyyymm_m1}/airnow/csv/${yyyy_m1}/${PDYm1}
Failed

  • exaqm_bias_correction_o3.sh[222]: cp ...
    cp: cannot stat '/lfs/h1/ops/prod/dcom/20240303/airnow/HourlyAQObs_2024030320.dat': No such file or directory
    Debug information:
    line 149:
    if [ "$(ls -A ${DCOMINairnow}/${cvt_pdy}/airnow)" ]; then
    cp ${DCOMINairnow}/${cvt_pdy}/airnow/HourlyAQObs_${cvt_pdy}.dat "${cvt_input_dir}/${cvt_yyyy}/${cvt_pdy}"
    else
    message_warning="WARNING: airnow data missing. skip this date ${cvt_pdy}"
    print_info_msg "${message_warning}"
    fi
    Line 2883 shows that HourlyAQObs_2024030320.dat was found by the statement:
    ls -A /lfs/h1/ops/prod/dcom/20240303/airnow/HourlyAQObs_${cvt_pdy}
    .dat

exaqm_bias_correction_o3.sh[149] statement:
cp ${DCOMINairnow}/${cvt_pdy}/airnow/HourlyAQObs_${cvt_pdy}*.dat "${cvt_input_dir}/${cvt_yyyy}/${cvt_pdy}"
has already copied the files to
/lfs/h2/emc/stmp/lin.gan/aqm/ecflow_aqm/aqm_bias_correction_o3_00.120073811.cbqs01/data/bcdata.202403/airnow/csv/2024/20240303
-rw-r--r-- 1 lin.gan emc 1.1M Mar 4 02:08 HourlyAQObs_2024030320.dat

aqm_lbcs job failed on Cactus and Dogwood because of loading module w3nco/2.4.1

The job failed on both Cactus and Dogwood. Lmod reports that the module cannot be loaded:

++ bash[71]: /usr/share/lmod/lmod/libexec/lmod bash load aqm_lbcs.local
Lmod has detected the following error: These module(s) or extension(s) exist
but cannot be loaded as requested: "w3nco/2.4.1"
Try: "module spider w3nco/2.4.1" to see how to load the module(s).

============
However, the module loads successfully when loaded manually on Dogwood.

@chan-hoo

Code compilation warning issues for AQMv7 implementation

The AQMv7 package was returned for violating EE2 compliance. Code compilation warnings are part of the reason.

  1. Code checkout
    repo_url = https://github.com/NOAA-EMC/AQM-utils
    hash = d8e6299

  2. Warning messages

2.1) ~/sorc/aqm_post_grib2.fd/CMakeFiles/aqm_post_grib2_lib.dir
sorc/aqm_post_grib2.fd/CMakeFiles/aqm_post_grib2_lib.dir/depend.make:245: warning: overriding recipe for target 'sorc/aqm_post_grib2.fd/CMakeFiles/aqm_post_grib2_lib.dir/parse_varexp_mod.mod.stamp'
sorc/aqm_post_grib2.fd/CMakeFiles/aqm_post_grib2_lib.dir/depend.make:188: warning: ignoring old recipe for target 'sorc/aqm_post_grib2.fd/CMakeFiles/aqm_post_grib2_lib.dir/parse_varexp_mod.mod.stamp'

2.2) ~/sorc/aqm_post_bias_cor_grib2.fd/CMakeFiles/aqm_post_bias_cor_grib2_lib.dir/depend.make
sorc/aqm_post_bias_cor_grib2.fd/CMakeFiles/aqm_post_bias_cor_grib2_lib.dir/depend.make:245: warning: overriding recipe for target 'sorc/aqm_post_bias_cor_grib2.fd/CMakeFiles/aqm_post_bias_cor_grib2_lib.dir/parse_varexp_mod.mod.stamp'
sorc/aqm_post_bias_cor_grib2.fd/CMakeFiles/aqm_post_bias_cor_grib2_lib.dir/depend.make:188: warning: ignoring old recipe for target 'sorc/aqm_post_bias_cor_grib2.fd/CMakeFiles/aqm_post_bias_cor_grib2_lib.dir/parse_varexp_mod.mod.stamp'

2.3) ~/sorc/aqm_post_maxi_bias_cor_grib2.fd/CMakeFiles/aqm_post_maxi_bias_cor_grib2_lib.dir/depend.make
sorc/aqm_post_maxi_bias_cor_grib2.fd/CMakeFiles/aqm_post_maxi_bias_cor_grib2_lib.dir/depend.make:180: warning: overriding recipe for target 'sorc/aqm_post_maxi_bias_cor_grib2.fd/CMakeFiles/aqm_post_maxi_bias_cor_grib2_lib.dir/parse_varexp_mod.mod.stamp'
sorc/aqm_post_maxi_bias_cor_grib2.fd/CMakeFiles/aqm_post_maxi_bias_cor_grib2_lib.dir/depend.make:123: warning: ignoring old recipe for target 'sorc/aqm_post_maxi_bias_cor_grib2.fd/CMakeFiles/aqm_post_maxi_bias_cor_grib2_lib.dir/parse_varexp_mod.mod.stamp'

@BrianCurtis-NOAA @KaiWang-NOAA

Can you help with this?

gefs2clbcs failed when the code was compiled with debug flags of "-ftrapuv -check all"

gefs2clbcs failed when the code was compiled with the debug flags "-ftrapuv -check all". This is part of the IT tests required by NCO to support the AQMv7 implementation.

The CMakeLists.txt is available on Cactus at

/lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/aqm.v7.0.88i/sorc/AQM-utils

The error message is below:

FAST_BYTESWAP ALGORITHM HAS BEEN USED AND DATA ALIGNMENT IS CORRECT FOR 4 )
forrtl: error (73): floating divide by zero
Image PC Routine Line Source
gefs2lbc_para 00000000004E6C5B Unknown Unknown Unknown
libpthread-2.31.s 000014F6305968C0 Unknown Unknown Unknown
gefs2lbc_para 000000000044B467 Unknown Unknown Unknown
gefs2lbc_para 000000000040BB52 Unknown Unknown Unknown
libc-2.31.so 000014F63027524D __libc_start_main Unknown Unknown
gefs2lbc_para 000000000040BA6A Unknown Unknown Unknown
nid001086.cactus.wcoss2.ncep.noaa.gov: rank 1 died from signal 6 and dumped core
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
gefs2lbc_para 00000000004E6C5B Unknown Unknown Unknown
libpthread-2.31.s 000014AD51E7B8C0 Unknown Unknown Unknown
gefs2lbc_para 00000000005851EA Unknown Unknown Unknown
gefs2lbc_para 0000000000539651 Unknown Unknown Unknown
gefs2lbc_para 0000000000449445 Unknown Unknown Unknown
gefs2lbc_para 000000000040BB52 Unknown Unknown Unknown
libc-2.31.so 000014AD51B5A24D __libc_start_main Unknown Unknown
gefs2lbc_para 000000000040BA6A Unknown Unknown Unknown
Application 1c511ef2-628e-4961-a083-528997f48a3e resources: utime=37s stime=3s maxrss=1595632KB inblock=7067908 oublock=24 minflt=36965 majflt=26 nvcsw=2253 nivcsw=81

Failure of reprocessing RAVE data files

I continued to encounter failures when processing the updated RAVE data during the new retrospective runs.

  1. J-job:
    /lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/aqm.v7.0.82b/job

  2. ex-script: exregional_fire_emission.sh
    /lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/aqm.v7.0.82b/scripts/

  3. AQM-utils: RAVE_remake.allspecies.aqmna13km.g793.py
    /lfs/h2/emc/physics/noscrub/jianping.huang/nwdev/packages/aqm.v7.0.82b/sorc/AQM-utils/python_utils

  4. Run log files: the error messages can be found in the following run log files
    fire_emission_2022062406.id_1689597573.log.0
    fire_emission_2022062412.id_1689597573.log.0

/lfs/h2/emc/ptmp/jianping.huang/emc.para/output/20220624 (Dogwood)
