pangeo-data / pangeo-example-notebooks
Pangeo Example Notebooks
Please see https://gallery.pangeo.io/ for more up-to-date content
Hello, I have upgraded from 0.1.1-86665a6 to 19.03.05, and in doing so I noticed the example notebooks are gone. Specifically, dask-array.ipynb was very useful, since it allowed for an easy, lightweight test of Pangeo. Since the upgrade, running this notebook fails with the error below. Are you able to point me in the direction of a fix? Thanks
from dask_kubernetes import KubeCluster
cluster = KubeCluster(n_workers=10)
cluster
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-1-00757e3889c1> in <module>
1 from dask_kubernetes import KubeCluster
----> 2 cluster = KubeCluster(n_workers=10)
3 cluster
/srv/conda/lib/python3.6/site-packages/dask_kubernetes/core.py in __init__(self, pod_template, name, namespace, n_workers, host, port, env, **kwargs)
178 msg = ("Worker pod specification not provided. See KubeCluster "
179 "docstring for ways to specify workers")
--> 180 raise ValueError(msg)
181
182 self.cluster = LocalCluster(ip=host or socket.gethostname(),
ValueError: Worker pod specification not provided. See KubeCluster docstring for ways to specify workers
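A minimal sketch of one fix, following the dask-kubernetes documentation: newer versions require an explicit worker pod specification, for example a worker-spec.yml file. The image, args, and resources below are illustrative assumptions, not this repo's actual spec:

```yaml
# worker-spec.yml -- illustrative worker pod specification for KubeCluster
kind: Pod
spec:
  restartPolicy: Never
  containers:
  - name: dask-worker
    image: daskdev/dask:latest
    args: [dask-worker, --nthreads, '2', --memory-limit, 4GB, --death-timeout, '60']
    resources:
      limits:
        cpu: "2"
        memory: 4G
```

The notebook cell then becomes cluster = KubeCluster.from_yaml('worker-spec.yml') followed by cluster.scale(10).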
@rabernat is frustrated that the hvplot plots in the sea level example notebook are taking much longer than the original matplotlib plots (the first plot takes about 20 seconds instead of 5).
I tried to build https://github.com/rsignell-usgs/EarthAnnotator/tree/patch-1 on http://binder.pangeo.io and it failed with
ValueError:
"@jupyter-widgets/jupyterlab-manager@0.38.1" is not compatible with the current JupyterLab
because jupyter labextension install @jupyter-widgets/jupyterlab-manager
is trying to install 0.38.1, but JupyterLab is at 0.34.
For compatibility we need these version pairings:
https://github.com/jupyter-widgets/ipywidgets/tree/master/packages/jupyterlab-manager#version
So we need to either update JupyterLab to 0.35 on http://binder.pangeo.io or pin the extension:
jupyter labextension install @jupyter-widgets/jupyterlab-manager@<version paired with JupyterLab 0.34>
Note: It builds fine on mybinder.org
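If updating JupyterLab is not an option, the pin could live in a repo2docker postBuild script. This assumes the repo uses the binder/ layout, and the 0.37 pairing for JupyterLab 0.34 should be verified against the compatibility table linked above:

```shell
#!/bin/bash
# binder/postBuild -- pin jupyterlab-manager to the release paired with
# the JupyterLab version baked into the image (0.34 at time of writing)
jupyter labextension install @jupyter-widgets/jupyterlab-manager@0.37
```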
Currently the MOM6 example by @gustavo-marques shows maps like this
@adcroft correctly pointed out that these are not the correct coordinates. We should be creating an irregular QuadMesh. This is apparently possible with holoviews / geoviews
This example from geoviews in particular looks like what we want.
However, the current MOM6 zarr dataset doesn't even have the geographic coordinates in it.
<xarray.Dataset>
Dimensions: (Time: 7300, nv: 2, xh: 540, xq: 540, yh: 458, yq: 458)
Coordinates:
* Time (Time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
* nv (nv) float64 1.0 2.0
* xh (xh) float64 -286.7 -286.0 -285.3 -284.7 -284.0 -283.3 ...
* xq (xq) float64 -286.3 -285.7 -285.0 -284.3 -283.7 -283.0 ...
* yh (yh) float64 -79.2 -79.08 -78.95 -78.82 -78.69 -78.56 -78.43 ...
* yq (yq) float64 -79.14 -79.01 -78.89 -78.76 -78.63 -78.49 ...
Data variables:
SSH (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
SSS (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
SST (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
Time_bnds (Time, nv) float64 dask.array<shape=(7300, 2), chunksize=(7300, 2)>
average_DT (Time) float64 dask.array<shape=(7300,), chunksize=(7300,)>
average_T1 (Time) float64 dask.array<shape=(7300,), chunksize=(7300,)>
average_T2 (Time) float64 dask.array<shape=(7300,), chunksize=(7300,)>
ePBL_h_ML (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
Attributes:
associated_files: area_t: g.c2b6.GNYF.T62_t061.control.004.mom6.static.nc
filename: g.c2b6.GNYF.T62_t061.control.004.mom6.sfc.day._0001_00...
grid_tile: N/A
grid_type: regular
title: MOM6 g.c2b6.GNYF.T62_t061.control.004 Experiment
So before we can fix the plot, we need to fix the data. @gustavo-marques, could you add the lat / lon coordinates? You can just upload the directories to the existing zarr group and it should "just work". (A lot easier than adding a variable to a netCDF file!)
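To make the request concrete, here is a small sketch of attaching 2-D geographic coordinates (named geolat/geolon here by assumption, as in typical MOM6 static files) to the tracer grid with xarray. The shapes are tiny stand-ins for the real (458, 540) grid:

```python
import numpy as np
import xarray as xr

# Tiny stand-in grid; the real dataset is (yh: 458, xh: 540).
ny, nx = 4, 5
geolat = np.repeat(np.linspace(-79.2, -78.8, ny)[:, None], nx, axis=1)
geolon = np.repeat(np.linspace(-286.7, -283.3, nx)[None, :], ny, axis=0)

ds = xr.Dataset({"SST": (("yh", "xh"), np.zeros((ny, nx), dtype="float32"))})
# Attach the curvilinear coordinates as 2-D (yh, xh) coordinate variables,
# which is what an irregular QuadMesh plot needs.
ds = ds.assign_coords(geolat=(("yh", "xh"), geolat),
                      geolon=(("yh", "xh"), geolon))
# Writing just the new coordinate arrays into the existing zarr store
# would then be roughly ds.to_zarr(store, mode="a") on a dataset holding
# only those variables.
```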
This is what I get on the latest image of this repo
Fetching package metadata .................
Solving package specifications:
An unexpected error has occurred.
Please consider posting the following information to the
conda GitHub issue tracker at:
https://github.com/conda/conda/issues
Current conda install:
platform : linux-64
conda version : 4.3.30
conda is private : False
conda-env version : 4.3.30
conda-build version : not installed
python version : 3.6.4.final.0
requests version : 2.18.4
root environment : /srv/conda (writable)
default environment : /srv/conda
envs directories : /srv/conda/envs
/home/jovyan/.conda/envs
package cache : /srv/conda/pkgs
/home/jovyan/.conda/pkgs
channel URLs : https://conda.anaconda.org/conda-forge/linux-64
https://conda.anaconda.org/conda-forge/noarch
https://repo.continuum.io/pkgs/main/linux-64
https://repo.continuum.io/pkgs/main/noarch
https://repo.continuum.io/pkgs/free/linux-64
https://repo.continuum.io/pkgs/free/noarch
https://repo.continuum.io/pkgs/r/linux-64
https://repo.continuum.io/pkgs/r/noarch
https://repo.continuum.io/pkgs/pro/linux-64
https://repo.continuum.io/pkgs/pro/noarch
config file : /srv/conda/.condarc
netrc file : None
offline mode : False
user-agent : conda/4.3.30 requests/2.18.4 CPython/3.6.4 Linux/4.4.111+ debian/stretch/sid glibc/2.26
UID:GID : 1000:1000
`$ /srv/conda/bin/conda-env update -v -n root -f binder/environment.yml`
Traceback (most recent call last):
File "/srv/conda/lib/python3.6/site-packages/conda/exceptions.py", line 640, in conda_exception_handler
return_value = func(*args, **kwargs)
File "/srv/conda/lib/python3.6/site-packages/conda_env/cli/main_update.py", line 106, in execute
installer.install(prefix, specs, args, env, prune=args.prune)
File "/srv/conda/lib/python3.6/site-packages/conda_env/installers/conda.py", line 32, in install
channel_priority_map=_channel_priority_map)
File "/srv/conda/lib/python3.6/site-packages/conda/plan.py", line 489, in install_actions_list
for specs_by_prefix in required_solves]
File "/srv/conda/lib/python3.6/site-packages/conda/plan.py", line 489, in <listcomp>
for specs_by_prefix in required_solves]
File "/srv/conda/lib/python3.6/site-packages/conda/plan.py", line 631, in get_actions_for_dists
pkgs = r.install(specs, installed, update_deps=update_deps)
File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 813, in install
pkgs = self.solve(specs, returnall=returnall)
File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 853, in solve
reduced_index = self.get_reduced_index(specs)
File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 374, in get_reduced_index
specs, features = self.verify_specs(specs)
File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 300, in verify_specs
raise ResolvePackageNotFound(bad_deps)
conda.exceptions.ResolvePackageNotFound:
- dask-kubernetes
- python-kubernetes 4
Removing intermediate container 34826b18a83f
The command '/bin/sh -c conda env update -v -n root -f "binder/environment.yml" && conda clean -tipsy' returned a non-zero code: 1
Tried to run the cell
from sklearn.externals import joblib
with joblib.parallel_backend('dask', scatter=[X, y]):
    grid_search.fit(X, y)
and got the output below (it's long...).
Possibly the RuntimeError, "Joblib backend requires either joblib >= '0.10.2' or sklearn > '0.17.1'. Please install or upgrade", is the main issue?
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-8-085d2322fa37> in <module>()
2
3 with joblib.parallel_backend('dask', scatter=[X, y]):
----> 4 grid_search.fit(X, y)
/opt/conda/lib/python3.6/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
637 error_score=self.error_score)
638 for parameters, (train, test) in product(candidate_params,
--> 639 cv.split(X, y, groups)))
640
641 # if one choose to see train score, "out" will contain train score info
/opt/conda/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self, iterable)
787 # consumption.
788 self._iterating = False
--> 789 self.retrieve()
790 # Make sure that we get a last message telling us we are done
791 elapsed_time = time.time() - self._start_time
/opt/conda/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in retrieve(self)
699 self._output.extend(job.get(timeout=self.timeout))
700 else:
--> 701 self._output.extend(job.get())
702
703 except BaseException as exception:
/opt/conda/lib/python3.6/site-packages/distributed/joblib.py in get()
249
250 def get():
--> 251 return ref().result()
252
253 future.get = get # monkey patch to achieve AsyncResult API
/opt/conda/lib/python3.6/site-packages/distributed/client.py in result(self, timeout)
190 raiseit=False)
191 if self.status == 'error':
--> 192 six.reraise(*result)
193 elif self.status == 'cancelled':
194 raise result
/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
690 value = tp()
691 if value.__traceback__ is not tb:
--> 692 raise value.with_traceback(tb)
693 raise value
694 finally:
/opt/conda/lib/python3.6/site-packages/distributed/protocol/pickle.py in loads()
57 def loads(x):
58 try:
---> 59 return pickle.loads(x)
60 except Exception:
61 logger.info("Failed to deserialize %s", x[:10000], exc_info=True)
/opt/conda/lib/python3.6/site-packages/distributed/joblib.py in <module>()
38 _bases.append(ParallelBackendBase)
39 if not _bases:
---> 40 raise RuntimeError("Joblib backend requires either `joblib` >= '0.10.2' "
41 " or `sklearn` > '0.17.1'. Please install or upgrade")
42
RuntimeError: Joblib backend requires either `joblib` >= '0.10.2' or `sklearn` > '0.17.1'. Please install or upgrade
tornado.application - ERROR - Exception in callback functools.partial(<function wrap.<locals>.null_wrapper at 0x7f4f6d45c7b8>, <Future finished exception=CancelledError(['_fit_and_score-batch-c8bc3da59762435bb023dded3c77fb1c'],)>)
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py", line 759, in _run_callback
ret = callback()
File "/opt/conda/lib/python3.6/site-packages/tornado/stack_context.py", line 276, in null_wrapper
return fn(*args, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py", line 780, in _discard_future_result
future.result()
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1107, in run
yielded = self.gen.throw(*exc_info)
File "/opt/conda/lib/python3.6/site-packages/distributed/joblib.py", line 241, in callback_wrapper
result = yield _wait([future])
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1099, in run
value = future.result()
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1113, in run
yielded = self.gen.send(value)
File "/opt/conda/lib/python3.6/site-packages/distributed/client.py", line 3346, in _wait
raise CancelledError(cancelled)
concurrent.futures._base.CancelledError: ['_fit_and_score-batch-c8bc3da59762435bb023dded3c77fb1c']
(the same tornado.application CancelledError traceback then repeats for each of the remaining _fit_and_score batches)
Running the notebook, on the line
elevation = ds['elevation'].isel(ensemble=0).persist()
it seems that .isel(ensemble=0) is unnecessary, as ds['elevation'] does not have an ensemble dimension (and thus raises an error, at least for me on pangeo.pydata.org):
>>> ds['elevation']
<xarray.DataArray 'elevation' (lat: 224, lon: 464)>
dask.array<shape=(224, 464), dtype=float64, chunksize=(224, 464)>
Coordinates:
* lat (lat) float64 25.06 25.19 25.31 25.44 25.56 25.69 25.81 25.94 ...
* lon (lon) float64 -124.9 -124.8 -124.7 -124.6 -124.4 -124.3 -124.2 ...
Attributes:
long_name: Terrain Elevation
standard_name: elevation
units: meters
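A defensive way to keep the cell working whether or not the variable carries an ensemble dimension; a small sketch, not necessarily how the notebook author wants to fix it:

```python
def select_first_member(da, dim="ensemble"):
    """Take the first slice along `dim` if it exists, otherwise
    return the array unchanged (works for xarray objects)."""
    if dim in da.dims:
        return da.isel(**{dim: 0})
    return da

# The notebook line would then become, roughly:
# elevation = select_first_member(ds['elevation']).persist()
```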
I am trying out these examples with the pangeo binder.
In pangeo.pydata.org, the following code works:
ds = xr.open_zarr(gcsfs.GCSMap('pangeo-data/SOSE'))
But in hub.binder.pangeo.io, it fails with
_call exception: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8d48ca4ac8>: Failed to establish a new connection: [Errno 110] Connection timed out',))
Traceback (most recent call last):
File "/srv/conda/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/srv/conda/lib/python3.6/site-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/srv/conda/lib/python3.6/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
I find I need to do
ds = xr.open_zarr(gcsfs.GCSMap('pangeo-data/SOSE', gcs=gcsfs.GCSFileSystem(token='anon')))
which is significantly uglier and more complicated.
Can we somehow make anonymous tokens the default for gcsfs?
Executing the lines
km = dask_ml.cluster.KMeans(n_clusters=3, init_max_iter=2, oversampling_factor=10)
km.fit(X)
results in
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-14-7d3bc4c475f7> in <module>()
1 km = dask_ml.cluster.KMeans(n_clusters=3, init_max_iter=2, oversampling_factor=10)
----> 2 km.fit(X)
/opt/conda/lib/python3.6/site-packages/dask_ml/cluster/k_means.py in fit(self, X, y)
197 max_iter=self.max_iter,
198 init_max_iter=self.init_max_iter,
--> 199 tol=self.tol,
200 )
201 self.cluster_centers_ = centroids
/opt/conda/lib/python3.6/site-packages/dask_ml/cluster/k_means.py in k_means(X, n_clusters, init, precompute_distances, n_init, max_iter, verbose, tol, random_state, copy_x, n_jobs, algorithm, return_n_iter, oversampling_factor, init_max_iter)
268 random_state=random_state,
269 oversampling_factor=oversampling_factor,
--> 270 init_max_iter=init_max_iter,
271 )
272 if return_n_iter:
/opt/conda/lib/python3.6/site-packages/dask_ml/cluster/k_means.py in _kmeans_single_lloyd(X, n_clusters, max_iter, init, verbose, x_squared_norms, random_state, tol, precompute_distances, oversampling_factor, init_max_iter)
550 counts = da.maximum(counts, 1)
551 new_centers = new_centers / counts[:, None]
--> 552 new_centers, = compute(new_centers)
553
554 # Convergence check
/opt/conda/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
400 keys = [x.__dask_keys__() for x in collections]
401 postcomputes = [x.__dask_postcompute__() for x in collections]
--> 402 results = schedule(dsk, keys, **kwargs)
403 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
404
/opt/conda/lib/python3.6/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, **kwargs)
2191 try:
2192 results = self.gather(packed, asynchronous=asynchronous,
-> 2193 direct=direct)
2194 finally:
2195 for f in futures.values():
/opt/conda/lib/python3.6/site-packages/distributed/client.py in gather(self, futures, errors, maxsize, direct, asynchronous)
1566 return self.sync(self._gather, futures, errors=errors,
1567 direct=direct, local_worker=local_worker,
-> 1568 asynchronous=asynchronous)
1569
1570 @gen.coroutine
/opt/conda/lib/python3.6/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
651 return future
652 else:
--> 653 return sync(self.loop, func, *args, **kwargs)
654
655 def __repr__(self):
/opt/conda/lib/python3.6/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
275 e.wait(10)
276 if error[0]:
--> 277 six.reraise(*error[0])
278 else:
279 return result[0]
/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
691 if value.__traceback__ is not tb:
692 raise value.with_traceback(tb)
--> 693 raise value
694 finally:
695 value = None
/opt/conda/lib/python3.6/site-packages/distributed/utils.py in f()
260 if timeout is not None:
261 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 262 result[0] = yield future
263 except Exception as exc:
264 error[0] = sys.exc_info()
/opt/conda/lib/python3.6/site-packages/tornado/gen.py in run(self)
1097
1098 try:
-> 1099 value = future.result()
1100 except Exception:
1101 self.had_exception = True
/opt/conda/lib/python3.6/site-packages/tornado/gen.py in run(self)
1105 if exc_info is not None:
1106 try:
-> 1107 yielded = self.gen.throw(*exc_info)
1108 finally:
1109 # Break up a reference to itself
/opt/conda/lib/python3.6/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
1445 six.reraise(type(exception),
1446 exception,
-> 1447 traceback)
1448 if errors == 'skip':
1449 bad_keys.add(key)
/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
690 value = tp()
691 if value.__traceback__ is not tb:
--> 692 raise value.with_traceback(tb)
693 raise value
694 finally:
/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in _compile_for_args()
366 e.patch_message(''.join(e.args) + help_msg)
367 # ignore the FULL_TRACEBACKS config, this needs reporting!
--> 368 raise e
369
370 def inspect_llvm(self, signature=None):
/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in _compile_for_args()
323 argtypes.append(self.typeof_pyval(a))
324 try:
--> 325 return self.compile(tuple(argtypes))
326 except errors.TypingError as e:
327 # Intercept typing error that may be due to an argument
/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in compile()
651
652 self._cache_misses[sig] += 1
--> 653 cres = self._compiler.compile(args, return_type)
654 self.add_overload(cres)
655 self._cache.save_overload(sig, cres)
/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in compile()
81 args=args, return_type=return_type,
82 flags=flags, locals=self.locals,
---> 83 pipeline_class=self.pipeline_class)
84 # Check typing error if object mode is used
85 if cres.typing_error is not None and not flags.enable_pyobject:
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in compile_extra()
871 pipeline = pipeline_class(typingctx, targetctx, library,
872 args, return_type, flags, locals)
--> 873 return pipeline.compile_extra(func)
874
875
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in compile_extra()
365 self.lifted = ()
366 self.lifted_from = None
--> 367 return self._compile_bytecode()
368
369 def compile_ir(self, func_ir, lifted=(), lifted_from=None):
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in _compile_bytecode()
802 """
803 assert self.func_ir is None
--> 804 return self._compile_core()
805
806 def _compile_ir(self):
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in _compile_core()
789 self.define_pipelines(pm)
790 pm.finalize()
--> 791 res = pm.run(self.status)
792 if res is not None:
793 # Early pipeline completion
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in run()
251 # No more fallback pipelines?
252 if is_final_pipeline:
--> 253 raise patched_exception
254 # Go to next fallback pipeline
255 else:
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in run()
243 try:
244 event(stage_name)
--> 245 stage()
246 except _EarlyPipelineCompletion as e:
247 return e.result
/opt/conda/lib/python3.6/site-packages/numba/compiler.py in stage_generic_rewrites()
478 with self.fallback_context(msg):
479 rewrites.rewrite_registry.apply('before-inference',
--> 480 self, self.func_ir)
481
482 def stage_nopython_rewrites(self):
/opt/conda/lib/python3.6/site-packages/numba/rewrites/registry.py in apply()
68 key, block = work_list.pop()
69 matches = rewrite.match(func_ir, block, pipeline.typemap,
---> 70 pipeline.calltypes)
71 if matches:
72 if config.DEBUG or config.DUMP_IR:
/opt/conda/lib/python3.6/site-packages/numba/rewrites/static_getitem.py in match()
19 if expr.op == 'getitem':
20 try:
---> 21 const = func_ir.infer_constant(expr.index)
22 except errors.ConstantInferenceError:
23 continue
/opt/conda/lib/python3.6/site-packages/numba/ir.py in infer_constant()
962 if isinstance(name, Var):
963 name = name.name
--> 964 return self._consts.infer_constant(name)
965
966 def get_definition(self, value, lhs_only=False):
/opt/conda/lib/python3.6/site-packages/numba/consts.py in infer_constant()
32 if name not in self._cache:
33 try:
---> 34 self._cache[name] = (True, self._do_infer(name))
35 except ConstantInferenceError as exc:
36 # Store the exception args only, to avoid keeping
/opt/conda/lib/python3.6/site-packages/numba/consts.py in _do_infer()
58 "no single definition for %r" % (name,))
59 try:
---> 60 const = defn.infer_constant()
61 except ConstantInferenceError:
62 if isinstance(defn, ir.Expr):
/opt/conda/lib/python3.6/site-packages/numba/ir.py in infer_constant()
344
345 def infer_constant(self):
--> 346 raise ConstantInferenceError('%s' % self, loc=self.loc)
347
348
/opt/conda/lib/python3.6/site-packages/numba/errors.py in __init__()
526 self.value = value
527 msg = "Cannot make a constant from: %s" % value
--> 528 super(ConstantInferenceError, self).__init__(msg, loc=loc)
529
530
/opt/conda/lib/python3.6/site-packages/numba/errors.py in __init__()
386 if loc:
387 super(NumbaError, self).__init__(
--> 388 highlight("%s\n%s\n" % (msg, loc.strformat())))
389 else:
390 super(NumbaError, self).__init__(highlight("%s" % (msg,)))
/opt/conda/lib/python3.6/site-packages/numba/ir.py in strformat()
85
86 ret.extend(selected[:-1])
---> 87 ret.append(_termcolor.highlight(selected[-1]))
88
89 # point at the problem with a caret
IndexError: Failed at nopython (nopython rewrites)
list index out of range
As we mentioned in our landsat8 demo blog post (https://medium.com/pangeo/cloud-native-geoprocessing-of-earth-observation-satellite-data-with-pangeo-997692d91ca2), there is still much room for improvement.
Here is a nice benchmarking analysis of reading cloud-optimized-geotiffs (COGs) on AWS: https://github.com/opendatacube/benchmark-rio-s3/blob/master/report.md#rasterio-configuration
And discussion of the report here:
http://osgeo-org.1560.x6.nabble.com/Re-Fwd-Cloud-optimized-GeoTIFF-configuration-specifics-SEC-UNOFFICIAL-tt5367948.html
It would be great to do similar benchmarking with our example, and see if there are simple ways to improve how COGs are read with the combination of xarray, dask, and rasterio.
Pinging some notebook authors on this one, @mrocklin, @jhamman, @rsignell-usgs, @darothen !
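The linked report largely comes down to a handful of GDAL configuration options. GDAL reads these from environment variables, so a quick way to experiment in a notebook is the sketch below (the option names are real GDAL settings discussed in the report; the values are just a commonly recommended starting point, and rasterio.Env(**cog_env) can apply the same options per-block instead of globally):

```python
import os

# GDAL settings often recommended for reading COGs over HTTP/S3.
# GDAL picks these up from environment variables at file-open time.
cog_env = {
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",  # skip listing the remote directory on open
    "CPL_VSIL_CURL_ALLOWED_EXTENSIONS": ".tif",   # only issue range requests for .tif paths
    "VSI_CACHE": "TRUE",                          # cache HTTP range reads in memory
}
os.environ.update(cog_env)
print(os.environ["GDAL_DISABLE_READDIR_ON_OPEN"])  # EMPTY_DIR
```

Benchmarking would then mean timing the same xarray/dask/rasterio read with and without these settings applied.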
Same issue as pangeo-data/pangeo-tutorial#14. Just opening it here too as a reminder to fix it there as well.
It seems the Docker image built from this repo does not have rasterio/gdal correctly installed. This results in an error when importing rasterio saying that libgdal is missing.
I believe the fix may not be too hard; I encountered this today when running a tutorial at a French lab. I did not have time to investigate yet; we may just need to update the repo2docker build, or point to the common Pangeo images.
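If the repo2docker route is taken, a sketch of the relevant environment.yml fragment might look like the following (assuming conda-forge, which ships rasterio together with a matching libgdal; pins and extra dependencies would need to be filled in):

```yaml
# environment.yml fragment -- sketch only; pin versions as needed
channels:
  - conda-forge
dependencies:
  - rasterio
  - gdal
```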
In this notebook, it looks like a cluster cannot be set up using cluster = KubeCluster(n_workers=10) anymore. Instead, one should follow the Dask Kubernetes documentation and use a worker specification YAML file.
However, when I execute the following on pangeo's binder:
cluster = KubeCluster.from_yaml('worker-spec.yml')
cluster.scale_up(10)
client = Client(cluster)
client
I cannot open the dashboard link (site 10.48.180.27 can't be reached). Any idea why?
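For completeness, a minimal worker-spec.yml along the lines of the dask-kubernetes documentation looks roughly like this (the image, labels, and resource numbers below are placeholders to adapt, not the file Pangeo ships):

```yaml
# worker-spec.yml -- sketch only; adjust image, resources, and args
kind: Pod
metadata:
  labels:
    app: dask-worker
spec:
  restartPolicy: Never
  containers:
  - name: dask
    image: daskdev/dask:latest
    args: [dask-worker, --nthreads, '2', --memory-limit, 6GB, --death-timeout, '60']
    resources:
      limits:
        cpu: "2"
        memory: 6G
```

On the unreachable dashboard: 10.48.180.27 looks like a cluster-internal pod IP, which is typically not routable from a browser outside the cluster; on Binder the dashboard usually has to be reached through the hub's proxied URL instead.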
ValueError Traceback (most recent call last)
in
2 from dask_gateway import Gateway
3
----> 4 gateway = Gateway()
5 cluster = gateway.new_cluster()
6 cluster.scale(30)
~/opt/anaconda3/lib/python3.7/site-packages/dask_gateway/client.py in __init__(self, address, proxy_address, auth, asynchronous, loop)
266 if address is None:
267 raise ValueError(
--> 268 "No dask-gateway address provided or found in configuration"
269 )
270 address = address.rstrip("/")
ValueError: No dask-gateway address provided or found in configuration
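For anyone hitting this: dask-gateway looks up its address in the Dask configuration, so the error means no gateway address was configured on that machine. One way to supply it is a gateway.yaml config file (the address below is a placeholder, not a real endpoint):

```yaml
# ~/.config/dask/gateway.yaml -- address is a placeholder; use your deployment's URL
gateway:
  address: "https://dask-gateway.example.org"
```

Alternatively, Gateway(address=...) accepts the address directly. On Pangeo hubs the address is normally pre-configured, so seeing this error usually means the notebook is running outside such a deployment (e.g. on a local Anaconda install, as the traceback path suggests).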
@scottyhq and @rabernat, partly to bone up on my hvplot/pyviz skills, I replaced all the plots in the Landsat NDVI notebook with hvplot interactive graphics here.
Not sure whether you want this as a PR or how best to handle it, but just wanted to let you know since it's pretty cool. For example, when live, you can zoom on one of the small multiple plots, and all the plots zoom together:
Thanks to the pangeo binder instance, you can also run the notebook live:
running the line in the Landsat8-cog-ndvi-hvplot.ipynb notebook
display(band1.hvplot(rasterize=True, width=600, height=400, cmap='viridis'))
results in:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/xarray/core/utils.py in __len__(self)
501 try:
--> 502 return self.shape[0]
503 except IndexError:
IndexError: tuple index out of range
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<timed eval> in <module>
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/hvplot/plotting/core.py in __call__(self, x, y, kind, **kwds)
70 return pn.panel(plot, **panel_dict)
71
---> 72 return self._get_converter(x, y, kind, **kwds)(kind, x, y)
73
74 def _get_converter(self, x=None, y=None, kind=None, **kwds):
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/hvplot/converter.py in __call__(self, kind, x, y)
942 obj = DynamicMap(cbcallable, streams=[self.stream])
943 else:
--> 944 obj = method(x, y)
945
946 if self.crs and self.project:
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/hvplot/converter.py in rgb(self, x, y, z, data)
1586 z = list(data.data_vars)[0]
1587 data = data[z]
-> 1588 nbands = len(data.coords[bands])
1589 if nbands < 3:
1590 raise ValueError('Selected bands coordinate (%s) has only %d channels,'
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/xarray/core/dataarray.py in __len__(self)
541
542 def __len__(self) -> int:
--> 543 return len(self.variable)
544
545 @property
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/xarray/core/utils.py in __len__(self)
502 return self.shape[0]
503 except IndexError:
--> 504 raise TypeError("len() of unsized object")
505
506
TypeError: len() of unsized object
and package versions:
# Print package versions
print('xarray version: ', xr.__version__)
print('rasterio version: ', rasterio.__version__)
print('hvplot version: ', hvplot.__version__)
xarray version: 0.15.1
rasterio version: 1.1.3
hvplot version: 0.5.2
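For reference, the final TypeError comes from xarray's __len__ helper: a 0-d (scalar) coordinate has shape (), so shape[0] raises IndexError, which xarray converts to "len() of unsized object"; hvplot then trips over this in its rgb() path when the band coordinate is scalar. A minimal stdlib sketch of that mechanism (safe_len is a hypothetical stand-in for xarray's method, not part of any library):

```python
def safe_len(shape):
    # Mimics the __len__ in xarray.core.utils shown in the traceback:
    # take shape[0], turning the IndexError on an empty shape () into
    # a TypeError, exactly as the pasted traceback reports.
    try:
        return shape[0]
    except IndexError:
        raise TypeError("len() of unsized object")

print(safe_len((3,)))  # a 1-d band coordinate with 3 entries -> 3

try:
    safe_len(())       # a scalar (0-d) band coordinate has shape ()
except TypeError as err:
    print(err)         # len() of unsized object
```

A plausible workaround, untested here, is to make sure band1 carries no scalar band coordinate before plotting (e.g. band1.squeeze(drop=True) or band1.drop_vars('band')) so hvplot does not take the rgb path at all.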