oloapinivad / ECmean4
EC-Earth basic evaluation tool
License: Apache License 2.0
We would like to have a more elaborate version of the global mean, able to deal with multiple seasons and regions, as was done recently in #46 for the performance indices.
We will need to create a script which performs the global mean operations done by the code on a restricted set of observations, and then outputs everything as a yaml file. It would be amazing if we could also store the variance for each variable/region/season, so that we can provide an estimate of the error.
The global mean output should also be converted to a pdf, similarly to the performance indices. However, we would like the color to indicate how many standard deviations we are from the observation, and to report the model value in the heatmap. It would be great to also show the value of the observations.
I noticed a significant slowdown of the global mean tool, especially when processing some CMIP6 data.
It seems more evident for models which have one big file including multiple years, while it is not evident when single years are processed.
I suspect this is due to the way the file list is being created; it might deserve a major refactoring.
In computing global means of ocean variables for CMOR data, we should actually use the information in sftof (the fraction of a cell covered by ocean). For NEMO this is always 100%, but not necessarily so for other models which might have fractional coverage.
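A minimal sketch of what such an sftof-aware mean could look like (plain Python with a hypothetical helper name; with xarray the same thing is essentially a one-liner via DataArray.weighted()):

```python
def ocean_weighted_mean(field, areacello, sftof):
    """Area-weighted ocean mean where each cell area (areacello, m^2) is
    scaled by the ocean fraction (CMOR sftof, in percent).  For NEMO
    sftof is 100 everywhere, so this reduces to the plain area-weighted
    mean; fractional cells of other models are down-weighted instead.
    With xarray: field.weighted(areacello * sftof / 100).mean()."""
    weights = [a * f / 100.0 for a, f in zip(areacello, sftof)]
    total = sum(x * w for x, w in zip(field, weights))
    return total / sum(weights)

# toy example: two cells, the second only half covered by ocean
ocean_weighted_mean([10.0, 20.0], [1.0, 1.0], [100.0, 50.0])  # -> 13.33...
```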
It would be nice to have a tool to produce, in a robust way, estimates of the code speed on a pre-defined set of data (let's say 30 years of CMIP6 data) with different numbers of cores. This can be very useful when developing new functionality, or to identify bottlenecks as mentioned in #50, and perhaps the results could be reported directly in the documentation.
It is unclear whether some existing tool can be used for this; it needs to be investigated.
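Pending a proper benchmarking tool, a minimal timing harness could look like the sketch below (`run` is a stand-in for whatever entry point gets benchmarked):

```python
import time

def benchmark(run, cores_list, repeats=3):
    """Call run(ncores) `repeats` times for each core count and keep the
    best wall-clock time, which is more robust against machine noise
    than a single measurement."""
    results = {}
    for ncores in cores_list:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(ncores)
            timings.append(time.perf_counter() - start)
        results[ncores] = min(timings)
    return results

# toy usage: replace the sleep with e.g. a 30-year performance_indices run
benchmark(lambda n: time.sleep(0.01 / n), cores_list=[1, 2, 4])
```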
The file variance_levitus_SSS.nc actually contains sea ice variance (it is identical to variance_GISS_SICE.nc)! Obviously this leads to a crazy value of the SSS performance index.
The PR #31 (Sphinx docs) also changed interfaces/interface_CMIP6.yml (I guess by mistake), probably to adapt it to the particular file structure used by @oloapinivad.
The previous version was compatible with the directory structure exposed by synda (available on our machines wilma and mafalda).
In branch fixcmip6 I reintroduced the previous version, but I also kept Paolo's version as interfaces/interface_CMIP6_PD.yml (so it can be used with -i CMIP6_PD). We can have different interface files for different directory structures.
So far the multiprocessing of the performance indices is based on a subdivision of the variables into chunks, done according to the available processors and the variable list.
However, it often occurs that all the 3d variables - which are the most computationally intensive - are put together. It would be significantly more efficient if these were shared among the processors. Exploiting parallel computation via dask should also be investigated.
More generally, we can think about finding a way to estimate the speed of this code with a testing procedure. Unfortunately, I am not sure that speed testing can be done reliably on GitHub.
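A possible fix for the chunking problem (a sketch with made-up cost estimates, not the ECmean4 code) is a greedy longest-processing-time split, so the heavy 3d variables end up on different processors:

```python
def balanced_chunks(variables, costs, nproc):
    """Greedy longest-processing-time split: sort variables by estimated
    cost (3d variables are far heavier than 2d ones) and always assign
    the next one to the currently lightest chunk."""
    chunks = [[] for _ in range(nproc)]
    load = [0.0] * nproc
    for var in sorted(variables, key=lambda v: costs[v], reverse=True):
        i = load.index(min(load))
        chunks[i].append(var)
        load[i] += costs[var]
    return chunks

# toy example: three heavy 3d variables and three light 2d ones on 3 cores
costs = {'ta': 10, 'ua': 10, 'va': 10, 'tas': 1, 'psl': 1, 'pr': 1}
balanced_chunks(list(costs), costs, 3)
# each chunk gets one 3d and one 2d variable instead of all 3d together
```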
In branch documentation (and in documentation-api) we implemented an initial version of sphinx+readthedocs to provide an automatic documentation procedure.
The default documentation works fine, and can be found at https://ecmean4.readthedocs.io/en/latest/
However, the autodoc part, which builds the description of the functions starting from their docstrings, appears to be a pain. Indeed, it works smoothly locally using sphinx in two different configurations, which involve the use of sphinx-api (7376c01) or of autosummary (cae29f2).
From a "style" point of view, the sphinx-api solution seems better, so it is considered the preferred one so far and is installed on the webpage.
When we move to the readthedocs server, both fail since they appear unable to find the correct modules. I suspect this is associated with the interdependencies of the two scripts, which probably should be two different modules, but it still puzzles me a lot that both crash.
An example of the crash for the sphinx-api solution (these are actually just errors):
WARNING: autodoc: failed to import module 'global_mean'; the following exception was raised:
Traceback (most recent call last):
File "/home/docs/checkouts/readthedocs.org/user_builds/ecmean4/envs/latest/lib/python3.9/site-packages/sphinx/ext/autodoc/importer.py", line 62, in import_module
return importlib.import_module(modname)
File "/home/docs/.asdf/installs/python/3.9.7/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/home/docs/checkouts/readthedocs.org/user_builds/ecmean4/checkouts/latest/global_mean.py", line 23, in <module>
from ecmean import var_is_there, load_yaml, \
File "/home/docs/checkouts/readthedocs.org/user_builds/ecmean4/checkouts/latest/ecmean.py", line 17, in <module>
cdo = Cdo()
File "/home/docs/checkouts/readthedocs.org/user_builds/ecmean4/envs/latest/lib/python3.9/site-packages/cdo.py", line 187, in __init__
self.operators = self.__getOperators()
File "/home/docs/checkouts/readthedocs.org/user_builds/ecmean4/envs/latest/lib/python3.9/site-packages/cdo.py", line 278, in __getOperators
version = parse_version(getCdoVersion(self.CDO))
File "/home/docs/checkouts/readthedocs.org/user_builds/ecmean4/envs/latest/lib/python3.9/site-packages/cdo.py", line 78, in getCdoVersion
proc = subprocess.Popen([path2cdo, '-V'], stderr=subprocess.PIPE, stdout=subprocess.PIPE)
File "/home/docs/.asdf/installs/python/3.9.7/lib/python3.9/subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/home/docs/.asdf/installs/python/3.9.7/lib/python3.9/subprocess.py", line 1696, in _execute_child
and os.path.dirname(executable)
File "/home/docs/.asdf/installs/python/3.9.7/lib/python3.9/posixpath.py", line 152, in dirname
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
No error is raised when a variable that does not exist in the climatology is requested...
There are a few cases where the warnings/errors shown when missing files are encountered are badly presented. At least two such cases have been identified.
In order to facilitate comparison with previous model iterations and with other models, being able to run the tools also on CMIP output in CMOR format is desirable. This is being explored in the new cmor branch.
The definition of the grid files and of the masking needs to be homogenised between the two tools.
Interface files are very convenient, but apparently we forgot to consider the case where a variable is not saved under a cmor name. As an example, consider 2t as output from raw IFS data or the ERA5 reanalysis. We cannot change the variable name in the config.yml since this would break any comparison.
We should introduce the possibility for the interface file to specify which variable should be loaded/seeked, which might differ from the cmor name. Something like:
tas:
varload: '2t'
varname: '2m Temperature'
filetype: atm2d
The new varload should be used everywhere in the code, and set identical to var when it is not defined. This should also come with a revision of the make_input_filename() function, which currently does not allow for such flexibility.
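A sketch of the proposed fallback (`varload` is the hypothetical new key; `resolve_varload` is an illustrative helper, not existing code):

```python
def resolve_varload(var, varinfo):
    """Return the name to seek in files: the optional 'varload' entry of
    the interface file, falling back to the cmor variable name itself."""
    return varinfo.get(var, {}).get('varload', var)

# toy interface-file content
face = {'tas': {'varload': '2t', 'varname': '2m Temperature'},
        'psl': {'varname': 'Mean sea level pressure'}}

resolve_varload('tas', face)   # '2t'  (explicit varload)
resolve_varload('psl', face)   # 'psl' (falls back to the cmor name)
```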
ECmean4 relies on a series of components which tell the inner code how to deal with land-sea masks, grid areas and grid specifications. These are listed within the component block of the interface files.
As an example, for cmor they read as:
component:
cmoratm:
inifile: 'r1i1p1f*/sftlf/sftlf_fx_{model}_{expname}_r1i1p1f*_{grid}.nc'
atmfile: 'r1i1p1f*/sftlf/sftlf_fx_{model}_{expname}_r1i1p1f*_{grid}.nc'
cmoroce:
gridfile: 'r1i1p1f*/sftof/sftof_Ofx_{model}_{expname}_r1i1p1f*_{grid}.nc'
areafile: 'r1i1p1f*/areacello/areacello_Ofx_{model}_{expname}_r1i1p1f*_{grid}.nc'
As another example, EC-Earth4 reads:
component:
oifs:
inifile: 'ICMGG{expname}INIT'
atmfile: 'output/oifs/{expname}_atm_cmip6_1m_{year1}-{year2}.nc'
nemo:
gridfile: nemo-initial-state.nc
areafile: domain_cfg.nc
The naming is not consistent, and the goal of each of these files can be non-trivial to understand. In principle, what we mandatorily need is:
- a land-sea mask for the atmosphere (sftlf in CMOR, or the OIFS initial file)
- grid areas for the atmosphere (computing them from the grid description via _area_cell() has been shown to cover many cases)
- grid areas for the ocean (areacello in CMOR)
Whether the oceanic gridfile is needed in this configuration must be assessed: it was originally introduced for EC-Earth4, but maybe it can be merged into the areafile.
Furthermore, each component implies a different treatment by the mask and interpolation functions, and this is very much ad hoc.
A general reorganization must be carried out to support different models.
The oceanic mask is made to work only with official cmip6 data, and it cannot load other variables. Also, a dangerous unit check is performed, and this can be a problem on certain occasions.
Furthermore, there is no need to print
WARNING -> No oceanic mask available for oceanic vars, this might lead to inconsistent results...
for every variable; we should issue this warning only once, at the beginning.
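A minimal way to emit the warning only once (a sketch; the real fix belongs wherever the mask is first checked) is a module-level guard:

```python
import warnings

_warned_no_ocean_mask = False

def warn_no_ocean_mask():
    """Emit the missing-ocean-mask warning only on the first call; all
    subsequent calls are silent."""
    global _warned_no_ocean_mask
    if not _warned_no_ocean_mask:
        warnings.warn("No oceanic mask available for oceanic vars, "
                      "this might lead to inconsistent results...")
        _warned_no_ocean_mask = True
```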
This seems to be a new problem which was not there in the past; most of the solutions proposed on the web do not work.
The pytest test implemented in the testing branch includes two flake8 checks that are now part of the workflow. They run before the python -m pytest call, which uses the two basic tests for atmosphere and coupled runs. The workflow is executed on every new commit and on every new pull request on the main branch.
The first flake8 call produces errors and makes the workflow fail. It checks for undefined variables and syntax errors (and it made me spot one undefined variable!):
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
The second produces a full report in the form of a warning. The line length is set according to the GitHub standard:
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
It would be important to run those tests BEFORE doing a pull request or a commit on the main branch.
A very basic way to fix most of the minor issues that arise while coding is running autopep8:
autopep8 --in-place --max-line-length=127 --recursive .
These three commands should always be run before merging a pull request.
This is more related to EC-Earth, but it is a nice side benefit of using pint:
wfo
kg/m2/s ---> m/100years
Unit converson required...
3.1688087814028946e-06 kilogram / meter ** 3 / second ** 2
Units mismatch, this cannot be handled!
What unit is kg/m2/s? Perhaps it is kg m-2 s-1?
Lines 142 to 146 in 171f6b1
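The ambiguity is most likely just the unit string: 'kg/m2/s' and the CDO-style 'kg m-2 s-1' are meant to be the same mass flux. Leaving pint aside, the conversion being asked for is plain arithmetic (a sketch, assuming fresh water density of 1000 kg/m3 and Julian years):

```python
RHO_W = 1000.0                   # kg/m^3, assumed fresh-water density
CENTURY = 100 * 365.25 * 86400   # seconds in 100 Julian years

def wfo_to_m_per_century(flux):
    """kg m-2 s-1 -> m per 100 years: divide the mass flux by the water
    density to get a vertical velocity, then scale to a century."""
    return flux / RHO_W * CENTURY

wfo_to_m_per_century(3.17e-8)    # ~0.1 m of water over 100 years
```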
There is no selyear call in the global_mean.py code, so if a file includes more than one year the whole file is currently processed. This is wrong, and it occurs quite often for CMIP data. Spotted while developing the xarray code.
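With xarray the fix is straightforward, since time labels can be sliced by year strings (ds.sel(time=slice(str(year1), str(year2)))); the principle, sketched in plain Python with a hypothetical helper:

```python
from datetime import datetime

def select_years(times, values, year1, year2):
    """Keep only the records whose year falls within [year1, year2].
    The xarray equivalent: ds.sel(time=slice(str(year1), str(year2)))."""
    return [(t, v) for t, v in zip(times, values)
            if year1 <= t.year <= year2]

# a three-year file of which only 1990-1991 was requested
times = [datetime(1990, 6, 15), datetime(1991, 6, 15), datetime(1992, 6, 15)]
select_years(times, [1.0, 2.0, 3.0], 1990, 1991)   # drops the 1992 record
```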
So far the two main functions work only from the command line.
We will need to restructure the corresponding parsers so that the two functions can also run when simply imported from an external script, which can be helpful if they are called from other tools.
Global means are computed via the cdo fldmean command.
However, the result seems to depend on the grid definition:
cdo output -fldmean -timmean -selname,tas -selname,tas
/lus/h2resw01/scratch/ccpd/ece4/MALE/output/oifs/MALE_atm_cmip6_1m_1990-1990.nc
287.088
cdo output -fldmean -timmean -selname,tas -setgridtype,regular -setgrid,/lus/h2resw01/scratch/ccpd/ece4/MALE/ICMGGMALEINIT -selname,tas
/lus/h2resw01/scratch/ccpd/ece4/MALE/output/oifs/MALE_atm_cmip6_1m_1990-1990.nc
287.083
Not a big difference, but it should be considered whether it has any implications.
As from the title, this makes the CDO-based version no longer operative, since it was used to find which variables are in the file.
Input files for every variable can change vastly between models. We will make an attempt to specify this in a more general and flexible way in the interface_* file. This is being developed in the filenames branch (a sub-branch of parallel, since that branch, still to be merged, has too large structural changes and it is better to restart from there).
A new branch pint-units is being developed to exploit unit conversion from the model data units to the units required by the dataset/global mean values.
Trying to provide a complete assessment of the CMIP6 data with both global mean and performance indices with the brand new xarray engine, we are facing some issues with some model grids/files. This could be 1) an issue of ECmean4 and the current xESMF interpolation method, or 2) some non-standard cmor files in the CMIP6 archive.
(in parentheses, the time to process 30 years of data for global mean and performance indices, with 8 cores on wilma)
ValueError: The horizontal shape of input data is (291, 360), different from that of the regridder (292, 362)!
This is to keep track of the development in devel/xarray. The idea is to convert the cdo engine to xarray, which is more self-consistent and allows for out-of-core computation via dask.
The first commit 82a0661 introduced xr_global_mean.py, which has a similar parsing structure and supports only EC-Earth4. It requires a conda environment to correctly handle the dependencies (safer than the pyenv originally developed).
Positive aspects:
Negative aspects:
Currently remapbil is used to interpolate all variables in the performance indices regridding. This is not appropriate for upscaling, and something like remapcon should be used instead. The problem is that the unstructured nemo grid is missing corner data, and so it does not allow the use of remapcon.
We introduce a new class to access CDO command chains (or pipelines)
We found out that if there are more available processors than variables to be processed, the code might crash.
This is currently being investigated.
While developing the xarray version of global_mean.py, I found out that the current filename expansion, i.e. _expand_filename(), does not make any difference between Amon and Omon variables, so if both are available they are both loaded and averaged together. This occurs specifically for historical EC-Earth3 on mafalda when computing net_sfc, which requires snowfall (prsn), available for both Amon and Omon; but other situations like this are possible.
CDO does not complain since it just does a global average, but xarray crashes, which is how I spotted the issue.
This has to be solved in the main branch: a possible workaround, currently used in the xarray branch, is to prepend an A or an O to the filename definition in interface_CMIP6.yml, as done here below:
filetype:
atm2d:
filename: '{var}_A{frequency}_{model}_{expname}_{ensemble}_{grid}_{year1}01-{year2}12.nc'
dir: '{ensemble}/{frequency}/{var}/{grid}/{version}'
component: cmoratm
atm3d:
filename: '{var}_A{frequency}_{model}_{expname}_{ensemble}_{grid}_{year1}01-{year2}12.nc'
dir: '{ensemble}/{frequency}/{var}/{grid}/{version}'
component: cmoratm
oce2d:
filename: '{var}_O{frequency}_{model}_{expname}_{ensemble}_{grid}_{year1}01-{year2}12.nc'
dir: '{ensemble}/{frequency}/{var}/{grid}/{version}'
component: cmoroce
ice:
filename: '{var}_{frequency}_{model}_{expname}_{ensemble}_{grid}_{year1}01-{year2}12.nc'
dir: '{ensemble}/{frequency}/{var}/{grid}/{version}'
component: cmoroce
It is not clean, but it solves all the issues so far.
This is probably due to some behaviour of seaborn; it needs to be investigated.
So far it is not possible to run with py311, since there is a conflict between xESMF, python and esmpy.
xESMF can't work with the most recent version of esmpy, so we run with 8.3.1. However, 8.3.1 does not run with python 3.11.
This should be addressed within days by xESMF: pangeo-data/xESMF#218
The goal is to use threads to exploit parallelization. Parallelizing over variables seems to be the best initial solution.
In order to simplify the variable look-up of the two scripts, we need to have an interface_ece4.yml which handles all the common information.
The requirements file currently contains outdated pinned packages, and way more than what is strictly needed.
A short, lean requirements file with only what is needed should be added.
I have just found that the main branch no longer works with atm-only runs.
Can't check now, but I will have a look at this later in the afternoon.
(.ECmean4) [ccpd@aa6-100 ECmean4]$ ./performance_indices.py ALFA 1990 1990
PI for net_sfc 18.288219451904297
PI for tas 15.55960464477539
PI for psl 1.8526122570037842
PI for pr 25.827404022216797
PI for tauu 12.488482475280762
PI for tauv 4.452453136444092
PI for ta 6.034306049346924
PI for ua 3.1787490844726562
PI for va 2.8235251903533936
PI for hus 4.428908824920654
Process Process-2:
Traceback (most recent call last):
File "/usr/local/apps/python3/3.8.8-01/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/apps/python3/3.8.8-01/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "./performance_indices.py", line 38, in worker
isavail, varunit = var_is_there(infile, var, face['variables'])
File "/perm/ccpd/ecearth4/ECmean4/ecmean.py", line 99, in var_is_there
ffile = infile[0]
IndexError: list index out of range
Done in 5.4165 seconds
Traceback (most recent call last):
File "./performance_indices.py", line 265, in <module>
main(args)
File "./performance_indices.py", line 218, in main
out_sequence = [var, varstat[var], piclim[var]['mask'], piclim[var]
File "<string>", line 2, in __getitem__
File "/usr/local/apps/python3/3.8.8-01/lib/python3.8/multiprocessing/managers.py", line 850, in _callmethod
raise convert_to_error(kind, result)
KeyError: 'tos'
To date, all the functions are gathered together in ecmean.py. This is not very efficient, since it implies tons of imports and also makes it quite complicated to browse the code. We should create an ecmean folder with different py scripts, where the functions are clustered according to their usage/import.
This emerged in the genbil branch, which precomputes the remap weights to speed up the performance indices calculation.
Currently, the chosen oceinifile is domain_cfg.nc, since it includes the information to compute the grid areas, fundamental for the averaging operations. However, this file has an incomplete grid description, so CDO recognizes its grid as generic. Therefore, this file cannot be used to generate the weights, and the OCEGRIDFILE variable includes a useless txt file.
An alternative could be to use output from the model itself, which has a curvilinear grid, or the nemo-initial-state.nc file. However, neither of them currently includes grid area information, which would require two different initial files - something we would like to avoid at first.
On the other hand, it might not be obvious how to reproduce the curvilinear grid description from the domain_cfg.nc file.
As noticed by Klaus W., the global mean lacks a measure for the sea ice.
After a long discussion on whether it is better to implement sea ice area or sea ice extent (see here for the subtle difference), we decided to stick to a version of sea ice area (siconc * gridarea), which is even simpler and does not require the 15% threshold.
It is a bit incomplete, but it can be robustly computed from monthly means and it is accepted by the SIMIP community (https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2019GL086749).
Similarly, it should be possible to implement a measure for sea ice volume.
I will work on a branch in the next weeks.
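The computation itself is simple (a sketch with toy numbers; siconc is the CMOR concentration in percent, areacello the cell area in m2):

```python
def sea_ice_area(siconc, areacello):
    """Sea ice area: concentration-weighted sum of cell areas, with no
    15% cutoff (unlike sea ice extent).  Result in m^2."""
    return sum(c / 100.0 * a for c, a in zip(siconc, areacello))

# toy example: three cells of 1e10 m^2 with 0%, 50% and 100% ice cover
sea_ice_area([0.0, 50.0, 100.0], [1.0e10, 1.0e10, 1.0e10])  # -> 1.5e10 m^2
```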
All ocean variables (tos, sos, wfo, zos) are missing reference values in gm_reference.yml
I have just introduced a pattern correlation branch https://github.com/oloapinivad/ECmean4/tree/correlation which also computes the correlation for a few selected variables. Currently the reference is the PI dataset. The structure is a simplified version of the performance indices script, performing very similar operations. I preferred to keep it separate in order to provide statistics for different diagnostics.
However, we will need to update our datasets.
An example run
(.ECmean4) [ccpd@aa6-100 ECmean4]$ ./correlation_pattern.py BETA 1990 1990
Pattern Correlation for tas 0.990241
Pattern Correlation for pr 0.839326
Pattern Correlation for psl 0.967952
Done in 1.4894 seconds
/ec/res4/scratch/ccpd/ece4/ECmean4/table/CP4_RK08_BETA_1990_1990.txt
| Var | Correlation | Domain | Dataset |
|-------+---------------+----------+-----------|
| tas | 0.990241 | land | CRU |
| pr | 0.839326 | global | CMAP |
| psl | 0.967952 | global | COADS |
This is to document the methodology to create a new climatology. A new script, named py-climatology-create.py, has been introduced in devel/clim-cmip6 -> https://github.com/oloapinivad/ECmean4/blob/devel/clim-cmip6/py-climatology-create.py and aims at providing a cornerstone for any future climatology development. This is very important since it allows for updating the existing EC22 climatology (or, more correctly, for building the new definitive EC23 version).
The tool is based on a yml file that tells it where the variables are stored on the local machine (now configured to work on wilma). The script has been implemented to exploit dask, so it can run with multiple processors and takes a few hours to cover the 10 variables. It uses CDO for the interpolation, with a different remapping method according to the variable. It provides both mean and variance, needed to compute the performance indices: for each variable, the default "yearly" climatology is now provided together with a season-averaged climatology, so that PIs can now be produced for multiple seasons.
The script simply computes the yearly or seasonal mean, then estimates the interannual variance. An outlier filtering for the variance is then applied, but we might not be happy with that (see below). A cool feature is that it automatically produces the climatology yml file (and, combined with the cmip6 creation file cmip6-clim-evaluate.py, it also provides the average values of cmip6 for each variable, season and domain): https://github.com/oloapinivad/ECmean4/blob/devel/clim-cmip6/climatology/EC23/pi_climatology_EC23.yml
Following the introduction of the pint package, a small inconsistency in the values of the spatial integrals is observed. I think this is correct, as it is treated in #12, but I write it down for the sake of my own clarity.
Take for example pr_oce and pr_land. They are measured in Sverdrup (1 Sv = 10^6 m3/s).
Precipitation is provided by the model in kg/m2/s which, dividing by the water density (1000 kg/m3) to get a volume, means 10^-3 m/s (i.e. 1 mm/s, or 86400 mm/day). What we currently do is integrate in space over the ocean or land surface, multiplying by the cell area and summing (the masked_mean() function in cdopipe.py): the area integral therefore comes out in kg/s, and combining the density factor (10^-3) with the Sverdrup (10^-6) gives a total conversion factor of 10^-9 from the integrated value to Sv.
ERA5 data suggest we have about 17 Sv of total precipitation, and indeed the result of
cdo output -fldsum -mul -timmean -selname,pr BETA_atm_cmip6_1m_1990-1990.nc area.nc
1.69177e+10
multiplied by 10^-9 gives 16.9 Sv, the right order of magnitude.
Indeed, this is taken into account by pint so that we get...
| pr_oce | Precipitation (ocean) | Sv | 0.0135796 | 13.4499 | ERA5 | 1990-2019 |
| pme_oce | Precip. minus evap. (ocean) | Sv | -0.00111626 | -1.24691 | ERA5 | 1990-2019 |
| pr_land | Precipitation (land) | Sv | 0.00333878 | 3.82094 | ERA5 | 1990-2019 |
| pme_land | Precip. minus evap. (land) | Sv | 0.00100797 | 1.39091 | ERA5 | 1990-2019 |
There is a 10^3 factor between the older "reference" data and what we currently get. Is this correct? Or was it wrong in the past?
As a further confirmation, I found on the wiki:
Approximately 505,000 cubic kilometres (121,000 cu mi) of water falls as precipitation each year; 398,000 cubic kilometres (95,000 cu mi) of it over the oceans.
398000 km3/year = 3.98 * 10^14 m3/year ≈ 1.26 * 10^7 m3/s = 12.6 Sv, which is in line with what we get from the new version.
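A plain-number check of the budget above (assuming the standard oceanographic Sverdrup, 1 Sv = 10^6 m3/s, and a water density of 1000 kg/m3):

```python
RHO_W = 1000.0          # kg/m^3, fresh-water density assumption
SV = 1.0e6              # m^3/s per Sverdrup
YEAR = 365.25 * 86400   # seconds per Julian year

# the cdo fldsum of pr * cell area above is a mass flux in kg/s
mass_flux = 1.69177e10
print(mass_flux / RHO_W / SV)        # ~16.9 Sv, close to ERA5's ~17 Sv

# Wikipedia's 398,000 km^3/year of ocean precipitation, for comparison
wiki_flux = 398000 * 1e9 / YEAR      # m^3/s
print(wiki_flux / SV)                # ~12.6 Sv
```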
@jhardenberg could you confirm that the above is correct, and that we need to fix the reference values (and perhaps change the reference unit, which is not very handy)?
Hi,
When trying to install ECmean4 into a fresh Python virtual environment, I get:
> pip install ECmean4
Collecting ECmean4
Using cached ECmean4-0.1.1-py3-none-any.whl (11.3 MB)
Collecting xarray (from ECmean4)
Using cached xarray-2023.5.0-py3-none-any.whl (994 kB)
Collecting netcdf4 (from ECmean4)
Using cached netCDF4-1.6.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.4 MB)
Collecting dask (from ECmean4)
Using cached dask-2023.6.0-py3-none-any.whl (1.2 MB)
INFO: pip is looking at multiple versions of ecmean4 to determine which version is compatible with other requirements. This could take a while.
Collecting ECmean4
Using cached ECmean4-0.1.0-py3-none-any.whl (11.3 MB)
ERROR: Cannot install ecmean4==0.1.0 and ecmean4==0.1.1 because these package versions have conflicting dependencies.
The conflict is caused by:
ecmean4 0.1.1 depends on esmpy
ecmean4 0.1.0 depends on esmpy
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
Explicitly requesting version 0.1.1 does not work either:
pip install 'ECmean4==0.1.1'
Collecting ECmean4==0.1.1
Using cached ECmean4-0.1.1-py3-none-any.whl (11.3 MB)
Collecting xarray (from ECmean4==0.1.1)
Using cached xarray-2023.5.0-py3-none-any.whl (994 kB)
Collecting netcdf4 (from ECmean4==0.1.1)
Using cached netCDF4-1.6.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.4 MB)
Collecting dask (from ECmean4==0.1.1)
Using cached dask-2023.6.0-py3-none-any.whl (1.2 MB)
INFO: pip is looking at multiple versions of ecmean4 to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement esmpy (from ecmean4) (from versions: none)
ERROR: No matching distribution found for esmpy
We are not happy with the current xesmf+esmpy+ESMF dependency, since there are a lot of limitations, especially for unstructured grids.
We might want to exploit the CDO bindings to perform these operations: creating the weights should be straightforward, and the later calls can potentially be performed online.
The two most important scripts of ECmean4, or better their two worker functions pi_worker and gm_worker, share massive portions of code (and this is true also for some basic operations of the main, such as config file loading and setup).
For example, browsing for variables and the unit adjustments are almost identical. It thus makes sense to create a set of common functions that are imported and handle these operations, to reduce the risk of mistakes, increase modularity and compact the code.
Most importantly, data access is done in two different ways by the two functions: global_mean still accesses the files year by year, while performance_indices loads all the required years together. In principle the second option should be more efficient considering xarray's performance, but we need to double check it. This should be made uniform as well.
The docs need further improvements (e.g. usage does not cover CMOR compatibility).
I created a docs2 branch to be used for rolling updates.
The structure that ECmean4 uses for seeking data is currently made of a series of file handling functions such as make_input_filenames(), _filter_filename_by_year(), _expand_filename(), etc.
This should be generalized to support #63, but also to handle cases where, for example, the year is not included in the filename structure.
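One possible direction (a hypothetical helper, not the actual ECmean4 code) is to expand only the placeholders a template actually declares, so that a template without a year is legal:

```python
from string import Formatter

def expand_filename(template, **fields):
    """Fill only the placeholders present in the template; a template
    with no {year1}/{year2} then expands to a year-independent name,
    and unknown fields fall back to a '*' glob."""
    used = {name for _, name, _, _ in Formatter().parse(template) if name}
    return template.format(**{k: fields.get(k, '*') for k in used})

expand_filename('{var}_{model}_{year1}.nc',
                var='tas', model='EC-Earth3', year1=1990)
# 'tas_EC-Earth3_1990.nc'
expand_filename('{var}_{model}.nc', var='tas', model='EC-Earth3')
# 'tas_EC-Earth3.nc'  (no year in the template, no error)
```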
We need to have some testing in place: ideally unit testing, but also a simple integration test just to make sure that the results do not change. I will start exploring this in a testing branch using pytest.