agcd-csiro

This repository contains scripts for creating a replica of CSIRO's commercially licensed version of the AGCD dataset on NCI.

The commercially licensed AGCD dataset is located at /datasets/work/af-cdp/work/agcd/ on Petrichor. (The full catalogue for the Digiscape Climate Data Portal is here.) The replica data is located at /g/data/xv83/agcd-csiro/ on NCI.

Daily data

Data transfer

The data transfer scripts/commands need to be run from Petrichor.

The daily historical data files were transferred using the transfer script:

$ ssh [email protected]
$ git clone https://github.com/AusClimateService/agcd-csiro.git
$ cd agcd-csiro
$ bash transfer_agcd-historical_daily.sh {variable} {nci_username} {nci_password}

The latest daily data (updated daily) can be transferred using scp, for example:

$ scp /datasets/work/af-cdp/work/agcd/climate/tmax.nc [email protected]:/g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20180209-20220402_daily.nc

Data processing

The latest data has some issues (e.g. duplication of time steps) and overlaps in time with the historical data.

These issues can be addressed by running the processing script on NCI:

$ cd /g/data/xv83/agcd-csiro/replica-code
$ git pull origin main
$ python process_current.py \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20180209-20220402_daily.nc \
    tmax \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20190101-20191231_daily.nc \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20200101-20201231_daily.nc \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20210101-20211231_daily.nc \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20220101-20220402_daily.nc
$ rm /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20180209-20220402_daily.nc
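
The output files in the command above each cover one calendar year, starting where the historical files stop. The following is a minimal Python sketch of that bookkeeping, not the code in process_current.py. The helper name plan_yearly_files is hypothetical, and the 2018-12-31 end date for the historical data is an assumption inferred from the first output file starting on 20190101.

```python
from datetime import date, timedelta


def plan_yearly_files(time_steps, historical_end, variable, out_dir):
    """Map one output path per calendar year to its de-duplicated time steps.

    Hypothetical helper: it mirrors the file naming used in the command
    above, not the internals of process_current.py.
    """
    # Drop repeated time steps and anything already covered by the
    # historical files, then put the remainder in chronological order.
    unique_steps = sorted({t for t in time_steps if t > historical_end})

    by_year = {}
    for step in unique_steps:
        by_year.setdefault(step.year, []).append(step)

    plan = {}
    for year, steps in sorted(by_year.items()):
        name = (
            f"{variable}_AGCD-CSIRO_r005_"
            f"{steps[0]:%Y%m%d}-{steps[-1]:%Y%m%d}_daily.nc"
        )
        plan[f"{out_dir}/{name}"] = steps
    return plan


# Daily time steps matching the date range in the transferred file name.
start, end = date(2018, 2, 9), date(2022, 4, 2)
steps = [start + timedelta(days=i) for i in range((end - start).days + 1)]
steps.append(date(2020, 6, 1))  # simulate a duplicated time step

plan = plan_yearly_files(
    steps,
    historical_end=date(2018, 12, 31),  # assumed end of the historical data
    variable="tmax",
    out_dir="/g/data/xv83/agcd-csiro/tmax/daily",
)
for path in plan:
    print(path)
```

Under these assumptions the printed paths are the four output files passed to process_current.py above.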

Rechunking

$ python to_rechunked_zarr.py \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_*_daily.nc \
    tmax \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_19100101-20220402_daily_time-chunked.zarr \
    /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_19100101-20220402_daily_space-chunked.zarr \
    /g/data/xv83/agcd-csiro/tmax/daily/temporary.zarr

Monthly data

The commercially licensed AGCD dataset does not include any historical monthly data. A research-only copy of the monthly historical data (up to mid-2020) is available on NCI in project zv2 (/g/data/zv2/agcd/v2/precip/total/r005/01month).

The latest monthly data (from 2020 or late 2019 onwards) is available as part of the commercial dataset for precipitation, tmax and tmin (there's no monthly vapour pressure data).

Data transfer

The data transfer commands need to be run from Petrichor:

$ scp /datasets/work/af-cdp/work/agcd/tmax/data/IDCKZX1A90_tmax_mean_r005_*.nc [email protected]:/g/data/xv83/agcd-csiro/tmax/monthly
$ scp /datasets/work/af-cdp/work/agcd/tmin/data/IDCKZN1A90_tmin_mean_r005_*.nc [email protected]:/g/data/xv83/agcd-csiro/tmin/monthly
$ scp /datasets/work/af-cdp/work/agcd/precip/data/IDCK2R1AT0_precip_total_r005_*.nc [email protected]:/g/data/xv83/agcd-csiro/precip/monthly

For the precipitation data, IDCK2R1AT0 appears to be v2 and IDCKZR1AT0 appears to be v1, although the values do not exactly match the data in zv2. (See precip_comparison.ipynb for details.)

Data processing

The files then simply need to be merged using cdo. For example:

$ cdo mergetime /g/data/xv83/agcd-csiro/tmax/monthly/IDCKZX1A90_tmax_mean_r005_2021*.nc /g/data/xv83/agcd-csiro/tmax/monthly/agcd_v1_tmax_mean_r005_monthly_2021.nc
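
When several years of monthly files are present, the same merge can be planned once per year. The sketch below only builds the cdo mergetime argument lists and does not run them. The function name mergetime_commands and the example file names (in particular the date suffix after the year) are hypothetical; the only thing taken from the command above is that the year follows "_r005_".

```python
import re
from collections import defaultdict


def mergetime_commands(filenames, out_template):
    """Return one `cdo mergetime` argument list per year found in filenames."""
    # Assumption: the four digits right after "_r005_" are the year,
    # as suggested by the "r005_2021*.nc" glob in the command above.
    year_pattern = re.compile(r"_r005_(\d{4})")

    by_year = defaultdict(list)
    for name in filenames:
        match = year_pattern.search(name)
        if match:
            by_year[match.group(1)].append(name)

    # cdo mergetime takes the input files first and the output file last.
    return {
        year: ["cdo", "mergetime", *sorted(names), out_template.format(year=year)]
        for year, names in sorted(by_year.items())
    }


# Hypothetical monthly file names, deliberately out of order.
files = [
    "IDCKZX1A90_tmax_mean_r005_20210201_20210228.nc",
    "IDCKZX1A90_tmax_mean_r005_20210101_20210131.nc",
    "IDCKZX1A90_tmax_mean_r005_20220101_20220131.nc",
]
commands = mergetime_commands(files, "agcd_v1_tmax_mean_r005_monthly_{year}.nc")
print(" ".join(commands["2021"]))
```

Each argument list could then be passed to subprocess.run on a machine where cdo is installed.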

agcd-csiro's People

Contributors

damienirving

Watchers

Mitchell Black

Forkers

taerwin

agcd-csiro's Issues

Obfuscation

Noting that this is a public repo: the code snippets in the readme are very useful for involved users, but there may be an argument for obfuscating the file paths and especially the user name, or for making the repo private to the org if possible.
Ping @DamienIrving

Permission to run updates

Hi @DamienIrving ,

It was noted that this could be automated on the CSIRO side to keep the xv83 holding up to date (thanks @ Tony), however currently you are the only person who can write to the agcd-csiro directory on NCI.
Could you please set ACLs and permissions such that any member of xv83_w could run the scripts here?
What do you think about the value of setting up a cron job on the CSIRO side to push the data across? But we can't do cron on the NCI side to process it so maybe not wise?

-Claire
