
eccov4-py's Introduction

Synopsis

ecco_v4_py is a Python package that includes tools for loading and manipulating the ECCO v4 ocean and sea-ice state estimate (http://ecco-group.org).

Extensive documentation is provided on our readthedocs page: http://ecco-v4-python-tutorial.readthedocs.io/index.html#

Installation

Installation instructions can be found here!

https://ecco-v4-python-tutorial.readthedocs.io/Installing_Python_and_Python_Packages.html

Contributors

If you would like to contribute, consider forking this repository and making pull requests via git!

Support

Contact [email protected] or Ian.Fenty at jpl.nasa.gov.

License

MIT License

Note on version numbers

ecco_v4_py uses the 'semantic versioning' scheme described here:

https://packaging.python.org/guides/distributing-packages-using-setuptools/#semantic-versioning-preferred

The essence of semantic versioning is a 3-part MAJOR.MINOR.MAINTENANCE numbering scheme:

MAJOR version when they make incompatible API changes,

MINOR version when they add functionality in a backwards-compatible manner, and

MAINTENANCE version when they make backwards-compatible bug fixes.
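
As a rough illustration of how the three parts are read, the sketch below parses a version string and checks whether an installed version should still satisfy code written against an older one. The helper names and the version strings are hypothetical and are not part of ecco_v4_py.

```python
def parse_version(version):
    """Split a 'MAJOR.MINOR.MAINTENANCE' string into a tuple of three ints."""
    major, minor, maintenance = (int(part) for part in version.split('.'))
    return major, minor, maintenance


def is_backwards_compatible(installed, required):
    """True if `installed` should work for code written against `required`.

    Same MAJOR means no incompatible API changes; MINOR and MAINTENANCE
    only need to be at least as new as what the code was written against.
    """
    return installed[0] == required[0] and installed[1:] >= required[1:]


# Hypothetical version numbers, for illustration only
same_major = is_backwards_compatible(parse_version('1.3.0'), parse_version('1.2.5'))
new_major = is_backwards_compatible(parse_version('2.0.0'), parse_version('1.9.9'))
```

Here `same_major` is True because only MINOR increased, while `new_major` is False because a MAJOR bump signals incompatible API changes.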

Note on testing with pytest

(credit to Tim Smith)

You can run the tests locally with the pytest package, which is available through conda-forge. With that installed, you can navigate to ECCOv4-py/ecco_v4_py/test and either:

Run all the tests exactly as they are on Travis (this takes a while, about 12 minutes):

py.test . -v --cov=ecco_v4_py --cov-config .coveragerc --ignore=ecco_v4_py/test/test_generate_ecco_netcdf_product.py

Or run any individual module, e.g. to run the few tests in ecco_utils:

py.test test_ecco_utils.py

(You can add -v or any other flags you want.)

eccov4-py's People

Contributors

dafyddstephenson, duncanbark, emmomp, ifenty, ivanaescobar, jetesdal, mayadebellis, owang01, timothyas

eccov4-py's Issues

plot_tiles does not seem to be working

When I try, e.g.

ecco_v4_py.plot_tiles(ds.Depth)

I get:

<class 'int'> <class 'int'> <class 'int'>
i= 2 2 3
<class 'int'> <class 'int'> <class 'int'>
i= 3 3 3
<class 'int'> <class 'int'> <class 'int'>
i= 4 0 2
<class 'int'> <class 'int'> <class 'int'>
i= 6 2 2
<class 'int'> <class 'int'> <class 'int'>
i= 7 3 2
<class 'int'> <class 'int'> <class 'int'>
i= 8 0 1
<class 'int'> <class 'int'> <class 'int'>
i= 9 1 1
<class 'int'> <class 'int'> <class 'int'>
i= 10 2 1
<class 'int'> <class 'int'> <class 'int'>
i= 11 3 1
<class 'int'> <class 'int'> <class 'int'>
i= 15 3 0
<class 'int'> <class 'int'> <class 'int'>
i= 16 0 -1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-32-4177f1d746f7> in <module>
      1 #ds.Depth.plot(col='tile',col_wrap=5,figsize=(16,6),share);
----> 2 ecco.plot_tiles(ds.Depth,Arctic_cap_tile_location=6)

/work/03754/tsmith/ECCOv4-py/ecco_v4_py/tile_plot.py in plot_tiles(tiles, cmap, layout, rotate_to_latlon, Arctic_cap_tile_location, show_colorbar, show_cbar_label, show_tile_labels, cbar_label, fig_size, **kwargs)
    337             #cur_arr[colnum*90:colnump1*90, rownum*90:rownump1*90] = cur_tile
    338             print('i=',i,rownum, colnum)
--> 339             cur_arr[colnum*90:colnump1*90, rownum*90:rownump1*90] = cur_tile
    340             ax.set_aspect('equal')
    341             ax.axis('on')

ValueError: could not broadcast input array from shape (90,90) into shape (0,90)

I probably won't have time to look into it until after ocean sciences. We need an automated test suite so that stuff like this doesn't break.
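
One possible reading of the traceback: the last printed line shows colnum = -1, and a slice from -90 up to 0 selects nothing, which would explain the target shape (0,90). The sketch below checks this with plain Python slice arithmetic. It assumes that colnump1 is colnum + 1 and that the target array axis is 360 elements long; neither is confirmed by the output above.

```python
def slice_length(start, stop, axis_size):
    """Number of elements selected by [start:stop] on an axis of length axis_size."""
    return len(range(*slice(start, stop).indices(axis_size)))


tile_size = 90
axis_size = 4 * tile_size  # assumption: four tiles fit along this axis

colnum = -1            # value printed for i=16 in the output above
colnump1 = colnum + 1  # assumption: colnump1 is colnum + 1

# [-90:0] resolves to [270:0] on a 360-long axis, which is empty
rows_selected = slice_length(colnum * tile_size, colnump1 * tile_size, axis_size)

# For comparison, a non-negative column index selects a full tile
rows_selected_ok = slice_length(3 * tile_size, 4 * tile_size, axis_size)
```

Under these assumptions `rows_selected` is 0 and `rows_selected_ok` is 90, so a (90,90) tile cannot be assigned into the empty selection.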

`get_section_masks` doesn't produce a straight meridional section

I'm trying to use get_section_masks to make a meridional section up the Florida coast. This is deep within the lat/lon part of the grid, so I would have expected specifying the end points as the cell centers at the same longitude would produce a straight meridional section, but instead the path jukes to the side at one point.

The included image shows the problem. The two end points are XC and YC at (i, j) = (73, 47) and (i, j) = (66, 73) on tile 10; these work out to 25.676°N, 85.5°W and 31.833°N, 85.5°W. The shading is maskC from get_section_masks with blue and red for 0 and 1, respectively, the (i,j) grid is contoured in black at an interval of 10, and the line connecting the two points is also in black.
[Image: meridional_section]

Anyone have any idea what's happening here? If get_section_masks has a hard time making a meridional section along a grid line in the lat/lon part of the grid, it makes me worry about its reliability in more complex situations.

It's easy enough to make a meridional section by hand in the lat/lon part of the grid, but other sections in other parts of the grid are tricky, so I'd rather use get_section_masks.

Dates are not necessarily in consecutive order when loading files

The following code gives me an xarray dataset with unsorted time points (see the output below):

data_dir= ECCO_dir + '/nctiles_monthly_snapshots'

year_start = 1993
year_end = 2017

# load one extra year worth of snapshots
ecco_monthly_snaps = ecco.recursive_load_ecco_var_from_years_nc(data_dir,vars_to_load=['ETAN'],years_to_load=range(year_start, year_end+1))
print(ecco_monthly_snaps.ETAN.time.isel(time=[0, -1]).values)

['2008-01-01T00:00:00.000000000' '1996-01-01T00:00:00.000000000']

I think the problem is here:

files = list(var_path.glob('**/*nc'))

and it probably can be fixed by adding sorted():

files = sorted(list(var_path.glob('**/*nc')))
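
The effect of the proposed fix can be tried in isolation. The sketch below builds a throwaway directory tree and lists it with a sorted recursive glob; the directory layout and file names are made up for the example and do not reflect the real ECCO file naming.

```python
import tempfile
from pathlib import Path


def list_nc_files(var_path):
    """Return all files matching '**/*nc' under var_path, sorted by path."""
    # glob() yields entries in an arbitrary, filesystem-dependent order,
    # so sort explicitly before the files are opened and concatenated.
    return sorted(Path(var_path).glob('**/*nc'))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one sub-directory per year, created out of order
    for year in (2008, 1993, 1996):
        year_dir = Path(tmp) / str(year)
        year_dir.mkdir()
        (year_dir / f'ETAN_{year}.nc').touch()

    names = [path.name for path in list_nc_files(tmp)]
```

Sorting paths gives chronological order only when the dates in the directory and file names are zero-padded and ordered year first, which is assumed here.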

basemap vs. cartopy

This package should migrate to Cartopy and leave Basemap behind for geographically-aware plotting, because Basemap is going away.

Unnecessary dependencies in setup.py

Hello,
thanks for this very nice package!
While trying to install it via poetry (that uses pip internally), I had some issues. After some research, I think this is because some unnecessary deps are defined in the setup files. These deps are:

  • proj (installing via pip tries to install https://pypi.org/project/proj/ , which is not the same as https://proj.org/ packaged in anaconda)
  • geos (installing via pip installs https://pypi.org/project/geos/)
  • pathlib (is part of the python standard lib, and installing from pip tries to use an old unmaintained version from 2014) (although it seems that this one will be removed from the setup.py of the next ecco-v4-py release)

Apart from my troubles installing via poetry (due to pathlib), this installs a lot of useless packages.

I can submit a PR to fix this issue if you want to.

ECCOv4-py/setup.py

Lines 24 to 28 in 3960799

'geos',
'matplotlib',
'netcdf4',
'numpy >= 1.17',
'proj',
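
A minimal sketch of the proposed cleanup, using only the requirement strings quoted above plus pathlib, which the issue says is also listed. The variable names are hypothetical, and the rest of the real setup.py is not shown here.

```python
# Requirement strings as quoted from setup.py above, plus 'pathlib',
# which the issue reports as also being listed
listed_requirements = [
    'geos',
    'matplotlib',
    'netcdf4',
    'numpy >= 1.17',
    'proj',
    'pathlib',
]

# Entries the issue flags: the PyPI 'proj' and 'geos' projects are not the
# PROJ and GEOS libraries, and pathlib ships with the Python standard library
flagged = {'geos', 'proj', 'pathlib'}

# Compare on the bare package name, ignoring any version specifier
install_requires = [
    requirement for requirement in listed_requirements
    if requirement.split()[0] not in flagged
]
```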

netcdf_product_generation.py

netcdf_product_generation.py calls load_ecco_vars_from_mds to load the grid and MDS diagnostics output. The order of arguments in the call to load_ecco_vars_from_mds is not correct. Specifically, the second and third arguments should be swapped to be consistent with the definition of load_ecco_vars_from_mds: the second and third arguments should be the grid directory name and MDS filename, respectively.

duplicative functionality in ECCOv4-py

Thanks again for getting the ball rolling on this important package!

I just had a look through the source code and noticed a few main areas where there is duplicate functionality with other packages:

Given the complexity of these routines, it would be best to just have to write and maintain them in one place.

Python 3 support

Thanks again for getting the ball rolling on this important package! I just had a look through the source code and noticed that it is written only for python 2. This is a big problem, as python 2 will not be maintained past 2020. Nearly all scientific python packages (including numpy and xarray) are dropping python 2 support then or before.
http://python3statement.org/

Since this is a brand new package without any legacy code to support, there is really no reason to write it in python 2. We definitely do not want to encourage new python users (potentially a significant fraction of ecco users) to start from 2 instead of 3!

Comments on latest PR

  1. A small comment, can we default to deleting code rather than commenting it out? This reduces the clutter, and it's not necessary because git keeps track of the changes for us :) e.g.

    #def get_basin_mask(basin_name, mask,
    # basin_path=os.path.join('..','binary_data')):

    #if os.path.exists(os.path.join(bin_dir, 'basins.data')):

    #if os.path.exists(os.path.join(basin_path, 'basins.data')):

  2. Another small one, since we're using os.path.join we might as well use it all the way: e.g. replace

    basin_path = os.path.join(package_directory, '../binary_data'),

    with
    basin_path = os.path.join(package_directory, '..', 'binary_data'),

  3. I don't really understand why the following lines are necessary. Does it have to do with running the tests in parallel? If they're just debug statements then I think they should be removed:

    global mds_dir_info
    if mds_dir_info == None:
        mds_dir_info = setup_mds_dir(tmpdir_factory,request, _experiments)
        print('-------- made mds_dirr_info ')
        print(mds_dir_info)
    return mds_dir_info

    global llc_dir_info
    if llc_dir_info == None:
        dirname, expected = llc_mds_datadirs
        llc_dir_info = [dirname, expected]
    else:
        dirname = llc_dir_info[0]
        expected = llc_dir_info[1]

    global llc_dir_info
    if llc_dir_info == None:
        dirname, expected = llc_mds_datadirs
        llc_dir_info = [dirname, expected]
    else:
        dirname = llc_dir_info[0]
        expected = llc_dir_info[1]
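
To illustrate comment 2 above, the sketch below builds the path with every component passed separately to os.path.join. The package_directory value is a made-up placeholder, and the os.path.normpath call is an addition for readability that the comment did not ask for.

```python
import os

# Hypothetical install location, for illustration only
package_directory = os.path.join('some', 'install', 'ecco_v4_py')

# Each component is a separate argument, so no separator is hard coded
basin_path = os.path.join(package_directory, '..', 'binary_data')

# Optional: collapse the '..' so the result is easier to read in messages
basin_path = os.path.normpath(basin_path)

expected = os.path.join('some', 'install', 'binary_data')
```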

read_bin_gen.py not working

read_bin_gen.py broke after a recent update that introduced the "pathlib" package.

The error occurs at line 80:
f = open(datafile, 'rb')
with the following error message:
TypeError: coercing to Unicode: need string or buffer, PosixPath found

The error is probably because "datafile" is not a string. A possible fix would be to convert "datafile" to a string, like
f = open(str(datafile), 'rb')
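
A small self-contained check of the suggested conversion is sketched below. The helper name and the file written to a temporary directory are made up for the example. On Python 3.6 and later, open() also accepts Path objects directly, so the str() call mainly matters for older interpreters such as the one that produced the error above.

```python
import tempfile
from pathlib import Path


def read_binary_file(datafile):
    """Read a file whose location is given as either a str or a Path."""
    # str() turns a Path into a plain string and leaves a string unchanged
    with open(str(datafile), 'rb') as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    datafile = Path(tmp) / 'example.data'  # hypothetical file name
    datafile.write_bytes(b'\x00\x01\x02')

    from_path = read_binary_file(datafile)
    from_string = read_binary_file(str(datafile))
```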

Installation guide issues

Hey @ifenty thanks so much for this python package and the great documentation! I recently found it and have been going through the examples. It's a great start for breaking my MATLAB addiction ...

I just wanted to make a couple notes:

  • basemap can no longer be installed via pip. Instead, users are directed via the basemap github readme to download a tar ball from the basemap release page

  • on their website they recommend building geos-3.3, but this was a pain. I had a much easier time using the mac package manager Homebrew. The command brew install geos did the trick.

  • I had trouble downloading pyresample via pip on my mac because of an openmp requirement. However, conda took care of it with the command: conda install -c conda-forge pyresample

Generic behavior with ASTE / global llc grids

This repository is currently designed to work with global llc grids, and by default the global llc90 grid as used in ECCOv4. However, it would be even more useful if we made it flexible to include various regional configurations with the LLC configuration, e.g. ASTE. This would allow us to merge in @raphaeldussin's MITgcm-recipes repo. For instance his regridding, open boundary routines, etc are a huge service to the community that I'm sure a lot of folks would benefit from.

Thanks to the flexibility of xmitgcm and xgcm, I think the only functions that need changing are:

  • ecco_utils.get_llc_grid: this defines the xgcm grid object with all of the LLC connections (note that in ECCO terms, the xgcm "face" dimension is renamed to "tile", because "face" refers to something specific in the gcmfaces package). @raphaeldussin has done this here for ASTE
  • read_bin_llc.load_ecco_vars_from_mds: this is basically a wrapper around xmitgcm.open_mdsdataset, but specifies some nice things like the reference date, time step, etc... and renames the face-> tile dimension. Also, one can load specific tiles and change the time step location based on time averaging etc.
  • the llc_array_conversion routines may need some fixing, but I don't think they're critical. In fact, the convert_tiles_to_xda function doesn't need to be changed, only the routines which convert between faces, compact, and tiles formats.

Specifically moving forward with ASTE, I think that we should make some of these changes by adding a flag such as "model_config" which has by default model_config='global', but could be changed to e.g. model_config='aste' to set up these different configurations.

I am happy to help start making these merges in my spare time. But I think @lauren-moseley would benefit a lot from helping out with this because you'll be working with ASTE - I haven't done anything with ASTE :) I suggest @lauren-moseley because I can imagine that @raphaeldussin is pretty swamped at the moment with job transition.

Where is 'basins.data'?

Can someone tell me where the file 'basins.data' can be found? It doesn't come with this package and it's not clear from the code where it comes from. Maybe we can add a comment about this? Thanks!

file paths

In several I/O routines we assume that the user is on a *nix machine where the forward slash '/' separates directories and filenames. This assumption is made when constructing full file paths to load. For example, in load_binary_array.py we have this line:
datafile = fdir + '/' + fname

It turns out that on windows machines this method of constructing filepaths fails because windows machines use backslashes.

It looks like a good solution for Python 3.4 and later is a standard library module called pathlib. It allows easy construction of file paths independent of operating system. See this blog post.

https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f

So, we have to go through and replace a bunch of hard coded "path + '/' + filename" with the syntax offered by pathlib.
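
As a sketch of what such a replacement could look like, the snippet below joins a directory and a file name with pathlib's `/` operator. The fdir and fname values are placeholders, and the two "pure" path classes are used only to show the separator each platform would get without needing to run on that platform.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

fdir = 'some_dir'         # placeholder directory
fname = 'some_file.data'  # placeholder file name

# Instead of: datafile = fdir + '/' + fname
datafile = Path(fdir) / fname  # uses the separator of the current platform

# Pure paths do no I/O, so both flavours can be built on any machine
posix_style = str(PurePosixPath(fdir) / fname)
windows_style = str(PureWindowsPath(fdir) / fname)
```

Here `posix_style` is 'some_dir/some_file.data' and `windows_style` is 'some_dir\some_file.data'.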

Reorient/rotate tools

@ifenty and @timothyas

Possibly relevant to the reorient/regrid u,v discussion -- I came across this https://github.com/pletzer/mint by Alex Pletzer (now in New Zealand). Oliver and I are taking a look to see if it can do local divergence preserving grid transformations. In theory line integral approaches look like they can do this, but until now everyone seems to have been wary of the coding involved when they hear the details.

Not sure how it handles bathymetry (Alex's examples look atmospheric).

import ecco_v4_py as ecco not working

Will return:


AttributeError Traceback (most recent call last)
in ()
----> 1 import ecco_v4_py as ecco

8 frames
/usr/lib/python3.6/inspect.py in cleandoc(doc)
618 onwards is removed."""
619 try:
--> 620 lines = doc.expandtabs().split('\n')
621 except UnicodeError:
622 return None

AttributeError: 'NoneType' object has no attribute 'expandtabs'

Doc missing for interpolating ECCO vectors to lat-lon grids?

Thanks to all the developers for the great work on ECCOv4-py! I'm trying to locate info on how ECCO vector fields have been interpolated on to lat-lon grids for variables such as EVEL* and NVEL*. I had an old link to:

https://ecco-v4-python-tutorial.readthedocs.io/ECCO_v4_Interpolating_Fields_to_Lat-Lon_Grid.html#Interpolating-ECCO-vectors-fields-to-lat-lon-grids

but now it displays <no title> in my browser. Am I missing an updated link or has this doc page been removed? Thanks -- and apologies if this was buried in a previous issue.

How to make a grid nc file like the "ECCO-GRID.nc" for LLC 270

The ECCO-GRID.nc is required for many functions in ECCOv4-py. My question is how to make a similar grid file for LLC270. My MITgcm simulation will output more than 300 grid*.nc files due to the CPU number. How to glue these grid*.nc files into one file like the ECCO-GRID.nc?

this package needs tests

This package contains lots of useful routines for dealing with ecco output. Fantastic work @ifenty for such a valuable contribution! 🥇

As we discussed at OSM, a test suite is very important for a community open source package. Matt Rocklin sums them up in this blog post better than I ever could.

Among the many advantages, the exercise of writing tests would force the ecco community to think precisely about what features the ECCOv4-py package needs to have in order to serve its users.

For an example of what such a test suite might look like, you can have a look at xmitgcm:
https://github.com/xgcm/xmitgcm/blob/master/xmitgcm/test/test_mds_store.py

Where to save binary data files

This package requires binary_data/basins.data for the get_basin module. Also, the (currently limited) test suite requires some binary data to test array conversion. As addressed by @ifenty in #70, keeping this directory as bin/ was not a good idea because installing the package through pip install can result in these data files getting written to the installing user's $HOME/bin directory.

I just tried installing the package through pip: it recognized/installed in my anaconda directory and installed as follows:

anaconda3/lib/python3.6/site-packages/ecco_v4_py
anaconda3/ecco_v4_py_binary_data/basins.(meta/data)

This would break get_basin.py too, since it expects the basins.data file to be in ../binary_data/basins.data (along with the tests). I am now understanding #59 much better. I am curious @cspencerjones - where did you end up finding basins.data?

TL; DR we need a consistent method for storing the few data files we need. Perhaps figshare is the way to go, and I'll start by checking this out. Anyone have any experience with doing this in a standard way?

Use resample_to_latlon for 3D dataset

Hi! Is there a way to use resample_to_latlon on a 3D dataset? The function works if I use a 2D dataset, e.g. (based on the documentation):

new_grid_lon, new_grid_lat, field_nearest_1deg =\
        ecco.resample_to_latlon(ecco_ds.XC, \
                                ecco_ds.YC, \
                                ecco_ds.SALT.isel(time=0,k=0),\
                                new_grid_min_lat, new_grid_max_lat, new_grid_delta_lat,\
                                new_grid_min_lon, new_grid_max_lon, new_grid_delta_lon,\
                                fill_value = np.NaN, \
                                mapping_method = 'nearest_neighbor',
                                radius_of_influence = 120000)

Unfortunately it doesn't work if I try to apply it across depth, e.g.:

new_grid_lon, new_grid_lat, field_nearest_1deg =\
        ecco.resample_to_latlon(ecco_ds.XC,
                                ecco_ds.YC,
                                ecco_ds.SALT.isel(time=0),  # without selecting a specific k
                                new_grid_min_lat, new_grid_max_lat, new_grid_delta_lat,
                                new_grid_min_lon, new_grid_max_lon, new_grid_delta_lon,
                                fill_value = np.NaN,
                                mapping_method = 'nearest_neighbor',
                                radius_of_influence = 120000)

I would appreciate any insights on this. Thank you so much!
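
One possible workaround, assuming the function only handles 2D fields, is to resample one depth level at a time and stack the results. The sketch below shows only the loop-and-stack pattern with NumPy. resample_level_stub is a hypothetical stand-in, not the real ecco.resample_to_latlon; with the real function, each call would take ecco_ds.SALT.isel(time=0, k=k) and keep the third returned value, as in the 2D call above.

```python
import numpy as np


def resample_level_stub(field_2d):
    """Hypothetical stand-in for one 2D resampling call (keeps every 2nd point)."""
    return field_2d[::2, ::2]


def resample_all_levels(field_3d, resample_level):
    """Apply a 2D resampling function to each level and stack along depth."""
    levels = [resample_level(field_3d[k]) for k in range(field_3d.shape[0])]
    return np.stack(levels, axis=0)


# Synthetic (k, y, x) field used only to exercise the pattern
field_3d = np.arange(3 * 4 * 6, dtype=float).reshape(3, 4, 6)
resampled = resample_all_levels(field_3d, resample_level_stub)
```

The result keeps depth as the leading axis, so `resampled` has shape (3, 2, 3) for this synthetic input.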

Feature request: generalize 'extract_yyyy_mm_dd_hh_mm_ss_from_datetime64' to accept array arguments

Hello,

At present, the 'extract_yyyy_mm_dd_hh_mm_ss_from_datetime64' function only accepts scalar arguments. If you give it an array of datetime64 values, it returns an error. Here is some code that illustrates the error:

# import  modules
import numpy as np
import xarray as xr
import xgcm
import sys

# import ecco_v4_py
sys.path.append('/Users/USERNAME/ECCOv4-py/ECCOv4-py')
import ecco_v4_py as ecco

# define main directory
base_dir = '/Users/USERNAME/ECCOv4-py/Version4/'
# define ECCO version
ecco_version = 'v4r4'
# define a high-level directory for ECCO fields
ECCO_dir = base_dir + '/Release4'
# define data directory
data_dir= ECCO_dir + '/nctiles_monthly'

# load monthly mean files
ds = ecco.recursive_load_ecco_var_from_years_nc(data_dir, \
                                                vars_to_load=['THETA'],\
                                                years_to_load=range(1993, 2017))

# extract xarray DataArray (an array of datetime64[ns] values)
tds = ds.time

Running the 'extract' function on a single scalar datetime64[ns] works fine:

ecco.extract_yyyy_mm_dd_hh_mm_ss_from_datetime64(ds.time[-1].values)
(2016, 12, 16, 12, 0, 0)

But giving the same function an array of datetime64[ns] values as follows

ecco.extract_yyyy_mm_dd_hh_mm_ss_from_datetime64(ds.time.values)

produces this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-8faeaf0a92b0> in <module>()
----> 1 ecco.extract_yyyy_mm_dd_hh_mm_ss_from_datetime64(ds.time.values)

/Users/USERNAME/ECCOv4-py/ECCOv4-py/ecco_v4_py/ecco_utils.py in extract_yyyy_mm_dd_hh_mm_ss_from_datetime64(dt64)
    189 
    190     s = str(dt64)
--> 191     year = int(s[0:4])
    192     mon = int(s[5:7])
    193     day = int(s[8:10])

ValueError: invalid literal for int() with base 10: "['19"

Could the function extract_yyyy_mm_dd_hh_mm_ss_from_datetime64 in ecco_utils be generalized to accept array arguments as well?
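
A minimal sketch of one way to generalize it is given below. It is not the package's implementation: the function name extract_ymdhms is hypothetical, and it reuses the string slicing visible in the traceback while adding the hour, minute, and second fields at the positions they occupy in an ISO timestamp.

```python
import numpy as np


def extract_ymdhms(dt64):
    """Return (year, month, day, hour, minute, second) tuples.

    A scalar datetime64 gives one tuple; an array gives a list of tuples.
    """
    # Truncate to whole seconds so every string looks like 'YYYY-MM-DDTHH:MM:SS'
    values = np.atleast_1d(np.asarray(dt64).astype('datetime64[s]'))

    results = []
    for value in values.ravel():
        s = str(value)
        results.append((int(s[0:4]), int(s[5:7]), int(s[8:10]),
                        int(s[11:13]), int(s[14:16]), int(s[17:19])))

    # Preserve the scalar behaviour of the original function
    return results[0] if np.ndim(dt64) == 0 else results


scalar_result = extract_ymdhms(np.datetime64('2016-12-16T12:00:00.000000000'))
array_result = extract_ymdhms(
    np.array(['1993-01-16T12:00:00', '2016-12-16T12:00:00'], dtype='datetime64[ns]'))
```

For the scalar used earlier in this issue, `scalar_result` is (2016, 12, 16, 12, 0, 0), matching the output of the existing function.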

How to get a contour plot of a different field on top of a contourf plot?

Hello,

I am doing something like this -

P = ecco.plot_proj_to_latlon_grid(ds.XC, ds.YC,
                                      field,
                                      plot_type = 'contourf',
                                      show_colorbar=True, cmap=cmocean.cm.balance, 
                                      cmin = -1, cmax = 1,
                                      user_lon_0 = -150,
                                      dx=2, dy=2, projection_type = 'robin',
                                      less_output = True)

I want to understand how one can introduce a contour plot on top of this of some other field, say field_new. I appreciate any help in this regard!

Thanks,
Shreyas

Errors thrown when 'tile' dim is named 'face' by xmitgcm.open_mdsdataset

Hi all, I'm not sure whether to raise this with ECCOv4-py or xmitgcm but when opening an MDS dataset (produced by re-running ECCOv4r4) with xmitgcm, the dimension tile which ECCOv4-py is looking for in many of its functions is named face, resulting in a range of errors. I have a clunky way around using xarray's rename function but I thought I'd bring this compatibility clash up in case somebody wanted to put something more permanent in the module to handle this. If this is more an issue for xmitgcm then apologies, I can take it over there instead. Thanks!
