
pangeo-example-notebooks's Introduction

pangeo-example-notebooks's People

Contributors

ahuang11, martindurant, rabernat, rsignell-usgs, scottyhq, tomaugspurger


pangeo-example-notebooks's Issues

pangeo test example

Hello, I have upgraded from 0.1.1-86665a6 to 19.03.05 and in doing so I notice the example notebooks are gone. Specifically, dask-array.ipynb was very useful, since it allows for an easy, lightweight test of pangeo. Since the upgrade, running this notebook errors as shown below. Are you able to point me in the direction of a fix? Thanks

from dask_kubernetes import KubeCluster
cluster = KubeCluster(n_workers=10)
cluster

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-00757e3889c1> in <module>
      1 from dask_kubernetes import KubeCluster
----> 2 cluster = KubeCluster(n_workers=10)
      3 cluster

/srv/conda/lib/python3.6/site-packages/dask_kubernetes/core.py in __init__(self, pod_template, name, namespace, n_workers, host, port, env, **kwargs)
    178             msg = ("Worker pod specification not provided. See KubeCluster "
    179                    "docstring for ways to specify workers")
--> 180             raise ValueError(msg)
    181 
    182         self.cluster = LocalCluster(ip=host or socket.gethostname(),

ValueError: Worker pod specification not provided. See KubeCluster docstring for ways to specify workers
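The error says a worker pod template is now required. One likely fix (a minimal sketch; the image name, resource limits, and worker arguments below are illustrative, not taken from this repo) is to describe the worker pod in a YAML file:

```yaml
# worker-spec.yml (hypothetical): minimal Dask worker pod for KubeCluster
kind: Pod
spec:
  restartPolicy: Never
  containers:
    - name: dask-worker
      image: daskdev/dask:latest   # illustrative image
      args: [dask-worker, --nthreads, '2', --memory-limit, 4GB, --death-timeout, '60']
      resources:
        limits:
          cpu: "2"
          memory: 4G
```

and pass it with something like `cluster = KubeCluster.from_yaml('worker-spec.yml', n_workers=10)`; `KubeCluster.from_yaml` is one of the ways the KubeCluster docstring mentions for specifying workers.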

Unforked

Just to let people know that I transferred the repo instead of keeping it as a fork from my GitHub.
cc @rabernat @jhamman @mrocklin , + please ping anyone else that needs to know.

jupyterhub extension compatibility

I tried to build https://github.com/rsignell-usgs/EarthAnnotator/tree/patch-1 on http://binder.pangeo.io and it failed with

ValueError:
"@jupyter-widgets/jupyterlab-manager@0.38.1" is not compatible with the current JupyterLab

because jupyter labextension install @jupyter-widgets/jupyterlab-manager is trying to install 0.38.1, but jupyterlab is at 0.34.

For compatibility we need these version pairings:
https://github.com/jupyter-widgets/ipywidgets/tree/master/packages/jupyterlab-manager#version

So we need to either update JupyterLab to 0.35 on http://binder.pangeo.io or specify

jupyter labextension install @jupyter-widgets/jupyterlab-manager@0.37

Note: It builds fine on mybinder.org

render MOM6 example with correct mesh geometry

Currently the MOM6 example by @gustavo-marques shows maps like this

[screenshot of the current rendered map]

@adcroft correctly pointed out that these are not the correct coordinates. We should be creating an irregular QuadMesh, which is apparently possible with holoviews / geoviews.

This example from geoviews in particular looks like what we want.
[screenshot of the geoviews QuadMesh example]

However, the current MOM6 zarr dataset doesn't even have the geographic coordinates in it.

<xarray.Dataset>
Dimensions:     (Time: 7300, nv: 2, xh: 540, xq: 540, yh: 458, yq: 458)
Coordinates:
  * Time        (Time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * nv          (nv) float64 1.0 2.0
  * xh          (xh) float64 -286.7 -286.0 -285.3 -284.7 -284.0 -283.3 ...
  * xq          (xq) float64 -286.3 -285.7 -285.0 -284.3 -283.7 -283.0 ...
  * yh          (yh) float64 -79.2 -79.08 -78.95 -78.82 -78.69 -78.56 -78.43 ...
  * yq          (yq) float64 -79.14 -79.01 -78.89 -78.76 -78.63 -78.49 ...
Data variables:
    SSH         (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
    SSS         (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
    SST         (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
    Time_bnds   (Time, nv) float64 dask.array<shape=(7300, 2), chunksize=(7300, 2)>
    average_DT  (Time) float64 dask.array<shape=(7300,), chunksize=(7300,)>
    average_T1  (Time) float64 dask.array<shape=(7300,), chunksize=(7300,)>
    average_T2  (Time) float64 dask.array<shape=(7300,), chunksize=(7300,)>
    ePBL_h_ML   (Time, yh, xh) float32 dask.array<shape=(7300, 458, 540), chunksize=(365, 458, 540)>
Attributes:
    associated_files:  area_t: g.c2b6.GNYF.T62_t061.control.004.mom6.static.nc
    filename:          g.c2b6.GNYF.T62_t061.control.004.mom6.sfc.day._0001_00...
    grid_tile:         N/A
    grid_type:         regular
    title:             MOM6 g.c2b6.GNYF.T62_t061.control.004 Experiment

So before we can fix the plot, we need to fix the data. @gustavo-marques, could you add the lat / lon coordinates? You can just upload the directories to the existing zarr group and it should "just work". (A lot easier than adding a variable to a netCDF file!)
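For intuition about the geometry an irregular QuadMesh needs: each (ny, nx) field wants (ny+1, nx+1) cell-corner coordinates, which is why the q-point axes (xq, yq) and the 2-D geographic arrays matter. A stdlib-only sketch (using the first few nominal xh centers from the repr above; the real 2-D geolon/geolat must still come from the static grid file) of deriving corner positions from cell centers:

```python
# Sketch: derive cell-corner (interface) positions from 1-D cell centers.
# Midpoints between adjacent centers, extrapolated at both ends.
def corners(centers):
    mids = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    first = centers[0] - (mids[0] - centers[0])
    last = centers[-1] + (centers[-1] - mids[-1])
    return [first] + mids + [last]

xh = [-286.7, -286.0, -285.3]   # nominal cell centers (from the dataset repr)
xq = corners(xh)                # one more corner than centers
```

For the curvilinear MOM6 grid this has to be done with the true 2-D latitude/longitude arrays rather than the nominal 1-D axes, which is exactly what the dataset is currently missing.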

image won't build

This is what I get on the latest image of this repo:

Fetching package metadata .................
Solving package specifications:
An unexpected error has occurred.
Please consider posting the following information to the
conda GitHub issue tracker at:

    https://github.com/conda/conda/issues



Current conda install:

               platform : linux-64
          conda version : 4.3.30
       conda is private : False
      conda-env version : 4.3.30
    conda-build version : not installed
         python version : 3.6.4.final.0
       requests version : 2.18.4
       root environment : /srv/conda  (writable)
    default environment : /srv/conda
       envs directories : /srv/conda/envs
                          /home/jovyan/.conda/envs
          package cache : /srv/conda/pkgs
                          /home/jovyan/.conda/pkgs
           channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.continuum.io/pkgs/main/linux-64
                          https://repo.continuum.io/pkgs/main/noarch
                          https://repo.continuum.io/pkgs/free/linux-64
                          https://repo.continuum.io/pkgs/free/noarch
                          https://repo.continuum.io/pkgs/r/linux-64
                          https://repo.continuum.io/pkgs/r/noarch
                          https://repo.continuum.io/pkgs/pro/linux-64
                          https://repo.continuum.io/pkgs/pro/noarch
            config file : /srv/conda/.condarc
             netrc file : None
           offline mode : False
             user-agent : conda/4.3.30 requests/2.18.4 CPython/3.6.4 Linux/4.4.111+ debian/stretch/sid glibc/2.26
                UID:GID : 1000:1000

`$ /srv/conda/bin/conda-env update -v -n root -f binder/environment.yml`




    Traceback (most recent call last):
      File "/srv/conda/lib/python3.6/site-packages/conda/exceptions.py", line 640, in conda_exception_handler
        return_value = func(*args, **kwargs)
      File "/srv/conda/lib/python3.6/site-packages/conda_env/cli/main_update.py", line 106, in execute
        installer.install(prefix, specs, args, env, prune=args.prune)
      File "/srv/conda/lib/python3.6/site-packages/conda_env/installers/conda.py", line 32, in install
        channel_priority_map=_channel_priority_map)
      File "/srv/conda/lib/python3.6/site-packages/conda/plan.py", line 489, in install_actions_list
        for specs_by_prefix in required_solves]
      File "/srv/conda/lib/python3.6/site-packages/conda/plan.py", line 489, in <listcomp>
        for specs_by_prefix in required_solves]
      File "/srv/conda/lib/python3.6/site-packages/conda/plan.py", line 631, in get_actions_for_dists
        pkgs = r.install(specs, installed, update_deps=update_deps)
      File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 813, in install
        pkgs = self.solve(specs, returnall=returnall)
      File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 853, in solve
        reduced_index = self.get_reduced_index(specs)
      File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 374, in get_reduced_index
        specs, features = self.verify_specs(specs)
      File "/srv/conda/lib/python3.6/site-packages/conda/resolve.py", line 300, in verify_specs
        raise ResolvePackageNotFound(bad_deps)
    conda.exceptions.ResolvePackageNotFound:
      - dask-kubernetes
      - python-kubernetes 4

Removing intermediate container 34826b18a83f
The command '/bin/sh -c conda env update -v -n root -f "binder/environment.yml" && conda clean -tipsy' returned a non-zero code: 1
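`ResolvePackageNotFound` for `dask-kubernetes` and `python-kubernetes 4` usually means the channel that provides them isn't being consulted. A hypothetical `binder/environment.yml` fragment (the pins are illustrative, not the repo's actual file) that makes conda-forge explicit:

```yaml
# Sketch: ensure conda-forge is listed so dask-kubernetes resolves
channels:
  - conda-forge
dependencies:
  - dask-kubernetes
  - python-kubernetes=4
```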

machine-learning.ipynb on http://pangeo.pydata.org RuntimeError

Tried to run the cell

from sklearn.externals import joblib

with joblib.parallel_backend('dask', scatter=[X, y]):
    grid_search.fit(X, y)

and got the output below (it's long...).
Possibly the RuntimeError: Joblib backend requires either joblib >= '0.10.2' or sklearn > '0.17.1'. Please install or upgrade is the main issue?

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-8-085d2322fa37> in <module>()
      2 
      3 with joblib.parallel_backend('dask', scatter=[X, y]):
----> 4     grid_search.fit(X, y)

/opt/conda/lib/python3.6/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    637                                   error_score=self.error_score)
    638           for parameters, (train, test) in product(candidate_params,
--> 639                                                    cv.split(X, y, groups)))
    640 
    641         # if one choose to see train score, "out" will contain train score info

/opt/conda/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in __call__(self, iterable)
    787                 # consumption.
    788                 self._iterating = False
--> 789             self.retrieve()
    790             # Make sure that we get a last message telling us we are done
    791             elapsed_time = time.time() - self._start_time

/opt/conda/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py in retrieve(self)
    699                     self._output.extend(job.get(timeout=self.timeout))
    700                 else:
--> 701                     self._output.extend(job.get())
    702 
    703             except BaseException as exception:

/opt/conda/lib/python3.6/site-packages/distributed/joblib.py in get()
    249 
    250         def get():
--> 251             return ref().result()
    252 
    253         future.get = get # monkey patch to achieve AsyncResult API

/opt/conda/lib/python3.6/site-packages/distributed/client.py in result(self, timeout)
    190                                   raiseit=False)
    191         if self.status == 'error':
--> 192             six.reraise(*result)
    193         elif self.status == 'cancelled':
    194             raise result

/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    690                 value = tp()
    691             if value.__traceback__ is not tb:
--> 692                 raise value.with_traceback(tb)
    693             raise value
    694         finally:

/opt/conda/lib/python3.6/site-packages/distributed/protocol/pickle.py in loads()
     57 def loads(x):
     58     try:
---> 59         return pickle.loads(x)
     60     except Exception:
     61         logger.info("Failed to deserialize %s", x[:10000], exc_info=True)

/opt/conda/lib/python3.6/site-packages/distributed/joblib.py in <module>()
     38     _bases.append(ParallelBackendBase)
     39 if not _bases:
---> 40     raise RuntimeError("Joblib backend requires either `joblib` >= '0.10.2' "
     41                        " or `sklearn` > '0.17.1'. Please install or upgrade")
     42 

RuntimeError: Joblib backend requires either `joblib` >= '0.10.2'  or `sklearn` > '0.17.1'. Please install or upgrade
tornado.application - ERROR - Exception in callback functools.partial(<function wrap.<locals>.null_wrapper at 0x7f4f6d45c7b8>, <Future finished exception=CancelledError(['_fit_and_score-batch-c8bc3da59762435bb023dded3c77fb1c'],)>)
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py", line 759, in _run_callback
    ret = callback()
  File "/opt/conda/lib/python3.6/site-packages/tornado/stack_context.py", line 276, in null_wrapper
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/tornado/ioloop.py", line 780, in _discard_future_result
    future.result()
  File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1107, in run
    yielded = self.gen.throw(*exc_info)
  File "/opt/conda/lib/python3.6/site-packages/distributed/joblib.py", line 241, in callback_wrapper
    result = yield _wait([future])
  File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1099, in run
    value = future.result()
  File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1113, in run
    yielded = self.gen.send(value)
  File "/opt/conda/lib/python3.6/site-packages/distributed/client.py", line 3346, in _wait
    raise CancelledError(cancelled)
concurrent.futures._base.CancelledError: ['_fit_and_score-batch-c8bc3da59762435bb023dded3c77fb1c']
[... eight more identical tornado CancelledError tracebacks, one per remaining _fit_and_score batch, omitted ...]

error in xarray-data.ipynb ?

Running the notebook fails on the line

elevation = ds['elevation'].isel(ensemble=0).persist()

It seems like .isel(ensemble=0) is unnecessary, as ds['elevation'] does not have an ensemble dimension (and thus raises an error, at least for me on pangeo.pydata.org).

>>> ds['elevation']
<xarray.DataArray 'elevation' (lat: 224, lon: 464)>
dask.array<shape=(224, 464), dtype=float64, chunksize=(224, 464)>
Coordinates:
  * lat      (lat) float64 25.06 25.19 25.31 25.44 25.56 25.69 25.81 25.94 ...
  * lon      (lon) float64 -124.9 -124.8 -124.7 -124.6 -124.4 -124.3 -124.2 ...
Attributes:
    long_name:      Terrain Elevation
    standard_name:  elevation
    units:          meters
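Besides simply dropping the .isel(ensemble=0), a defensive pattern (a sketch, not from the notebook) is to apply only indexers whose dimension the variable actually has:

```python
# Sketch: keep only indexers whose dimension exists on the variable,
# so a selection like ensemble=0 is silently skipped when absent.
dims = ('lat', 'lon')            # ds['elevation'].dims, per the repr above
indexers = {'ensemble': 0}
safe = {d: i for d, i in indexers.items() if d in dims}
# in the real notebook: elevation = ds['elevation'].isel(**safe).persist()
```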

GCS zarr datasets can only be opened with token='anon' from binder

I am trying out these examples with the pangeo binder.

In pangeo.pydata.org, the following code works:

ds = xr.open_zarr(gcsfs.GCSMap('pangeo-data/SOSE'))

But in hub.binder.pangeo.io, it fails with

_call exception: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1/instance/service-accounts/default/?recursive=true (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8d48ca4ac8>: Failed to establish a new connection: [Errno 110] Connection timed out',))
Traceback (most recent call last):
  File "/srv/conda/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/srv/conda/lib/python3.6/site-packages/urllib3/util/connection.py", line 83, in create_connection
    raise err
  File "/srv/conda/lib/python3.6/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

I find I need to do

ds = xr.open_zarr(gcsfs.GCSMap('pangeo-data/SOSE', gcs=gcsfs.GCSFileSystem(token='anon')))

which is significantly uglier and more complicated.

Can we somehow make anonymous tokens the default for gcsfs?

cc @martindurant, @jhamman

machine-learning.ipynb does not work

Executing the lines

km = dask_ml.cluster.KMeans(n_clusters=3, init_max_iter=2, oversampling_factor=10)
km.fit(X)

results in

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-14-7d3bc4c475f7> in <module>()
      1 km = dask_ml.cluster.KMeans(n_clusters=3, init_max_iter=2, oversampling_factor=10)
----> 2 km.fit(X)

/opt/conda/lib/python3.6/site-packages/dask_ml/cluster/k_means.py in fit(self, X, y)
    197             max_iter=self.max_iter,
    198             init_max_iter=self.init_max_iter,
--> 199             tol=self.tol,
    200         )
    201         self.cluster_centers_ = centroids

/opt/conda/lib/python3.6/site-packages/dask_ml/cluster/k_means.py in k_means(X, n_clusters, init, precompute_distances, n_init, max_iter, verbose, tol, random_state, copy_x, n_jobs, algorithm, return_n_iter, oversampling_factor, init_max_iter)
    268         random_state=random_state,
    269         oversampling_factor=oversampling_factor,
--> 270         init_max_iter=init_max_iter,
    271     )
    272     if return_n_iter:

/opt/conda/lib/python3.6/site-packages/dask_ml/cluster/k_means.py in _kmeans_single_lloyd(X, n_clusters, max_iter, init, verbose, x_squared_norms, random_state, tol, precompute_distances, oversampling_factor, init_max_iter)
    550             counts = da.maximum(counts, 1)
    551             new_centers = new_centers / counts[:, None]
--> 552             new_centers, = compute(new_centers)
    553 
    554             # Convergence check

/opt/conda/lib/python3.6/site-packages/dask/base.py in compute(*args, **kwargs)
    400     keys = [x.__dask_keys__() for x in collections]
    401     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 402     results = schedule(dsk, keys, **kwargs)
    403     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    404 

/opt/conda/lib/python3.6/site-packages/distributed/client.py in get(self, dsk, keys, restrictions, loose_restrictions, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, **kwargs)
   2191             try:
   2192                 results = self.gather(packed, asynchronous=asynchronous,
-> 2193                                       direct=direct)
   2194             finally:
   2195                 for f in futures.values():

/opt/conda/lib/python3.6/site-packages/distributed/client.py in gather(self, futures, errors, maxsize, direct, asynchronous)
   1566             return self.sync(self._gather, futures, errors=errors,
   1567                              direct=direct, local_worker=local_worker,
-> 1568                              asynchronous=asynchronous)
   1569 
   1570     @gen.coroutine

/opt/conda/lib/python3.6/site-packages/distributed/client.py in sync(self, func, *args, **kwargs)
    651             return future
    652         else:
--> 653             return sync(self.loop, func, *args, **kwargs)
    654 
    655     def __repr__(self):

/opt/conda/lib/python3.6/site-packages/distributed/utils.py in sync(loop, func, *args, **kwargs)
    275             e.wait(10)
    276     if error[0]:
--> 277         six.reraise(*error[0])
    278     else:
    279         return result[0]

/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    691             if value.__traceback__ is not tb:
    692                 raise value.with_traceback(tb)
--> 693             raise value
    694         finally:
    695             value = None

/opt/conda/lib/python3.6/site-packages/distributed/utils.py in f()
    260             if timeout is not None:
    261                 future = gen.with_timeout(timedelta(seconds=timeout), future)
--> 262             result[0] = yield future
    263         except Exception as exc:
    264             error[0] = sys.exc_info()

/opt/conda/lib/python3.6/site-packages/tornado/gen.py in run(self)
   1097 
   1098                     try:
-> 1099                         value = future.result()
   1100                     except Exception:
   1101                         self.had_exception = True

/opt/conda/lib/python3.6/site-packages/tornado/gen.py in run(self)
   1105                     if exc_info is not None:
   1106                         try:
-> 1107                             yielded = self.gen.throw(*exc_info)
   1108                         finally:
   1109                             # Break up a reference to itself

/opt/conda/lib/python3.6/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1445                             six.reraise(type(exception),
   1446                                         exception,
-> 1447                                         traceback)
   1448                     if errors == 'skip':
   1449                         bad_keys.add(key)

/opt/conda/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    690                 value = tp()
    691             if value.__traceback__ is not tb:
--> 692                 raise value.with_traceback(tb)
    693             raise value
    694         finally:

/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in _compile_for_args()
    366                     e.patch_message(''.join(e.args) + help_msg)
    367             # ignore the FULL_TRACEBACKS config, this needs reporting!
--> 368             raise e
    369 
    370     def inspect_llvm(self, signature=None):

/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in _compile_for_args()
    323                 argtypes.append(self.typeof_pyval(a))
    324         try:
--> 325             return self.compile(tuple(argtypes))
    326         except errors.TypingError as e:
    327             # Intercept typing error that may be due to an argument

/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in compile()
    651 
    652                 self._cache_misses[sig] += 1
--> 653                 cres = self._compiler.compile(args, return_type)
    654                 self.add_overload(cres)
    655                 self._cache.save_overload(sig, cres)

/opt/conda/lib/python3.6/site-packages/numba/dispatcher.py in compile()
     81                                       args=args, return_type=return_type,
     82                                       flags=flags, locals=self.locals,
---> 83                                       pipeline_class=self.pipeline_class)
     84         # Check typing error if object mode is used
     85         if cres.typing_error is not None and not flags.enable_pyobject:

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in compile_extra()
    871     pipeline = pipeline_class(typingctx, targetctx, library,
    872                               args, return_type, flags, locals)
--> 873     return pipeline.compile_extra(func)
    874 
    875 

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in compile_extra()
    365         self.lifted = ()
    366         self.lifted_from = None
--> 367         return self._compile_bytecode()
    368 
    369     def compile_ir(self, func_ir, lifted=(), lifted_from=None):

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in _compile_bytecode()
    802         """
    803         assert self.func_ir is None
--> 804         return self._compile_core()
    805 
    806     def _compile_ir(self):

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in _compile_core()
    789         self.define_pipelines(pm)
    790         pm.finalize()
--> 791         res = pm.run(self.status)
    792         if res is not None:
    793             # Early pipeline completion

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in run()
    251                     # No more fallback pipelines?
    252                     if is_final_pipeline:
--> 253                         raise patched_exception
    254                     # Go to next fallback pipeline
    255                     else:

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in run()
    243                 try:
    244                     event(stage_name)
--> 245                     stage()
    246                 except _EarlyPipelineCompletion as e:
    247                     return e.result

/opt/conda/lib/python3.6/site-packages/numba/compiler.py in stage_generic_rewrites()
    478         with self.fallback_context(msg):
    479             rewrites.rewrite_registry.apply('before-inference',
--> 480                                             self, self.func_ir)
    481 
    482     def stage_nopython_rewrites(self):

/opt/conda/lib/python3.6/site-packages/numba/rewrites/registry.py in apply()
     68                 key, block = work_list.pop()
     69                 matches = rewrite.match(func_ir, block, pipeline.typemap,
---> 70                                         pipeline.calltypes)
     71                 if matches:
     72                     if config.DEBUG or config.DUMP_IR:

/opt/conda/lib/python3.6/site-packages/numba/rewrites/static_getitem.py in match()
     19             if expr.op == 'getitem':
     20                 try:
---> 21                     const = func_ir.infer_constant(expr.index)
     22                 except errors.ConstantInferenceError:
     23                     continue

/opt/conda/lib/python3.6/site-packages/numba/ir.py in infer_constant()
    962         if isinstance(name, Var):
    963             name = name.name
--> 964         return self._consts.infer_constant(name)
    965 
    966     def get_definition(self, value, lhs_only=False):

/opt/conda/lib/python3.6/site-packages/numba/consts.py in infer_constant()
     32         if name not in self._cache:
     33             try:
---> 34                 self._cache[name] = (True, self._do_infer(name))
     35             except ConstantInferenceError as exc:
     36                 # Store the exception args only, to avoid keeping

/opt/conda/lib/python3.6/site-packages/numba/consts.py in _do_infer()
     58                 "no single definition for %r" % (name,))
     59         try:
---> 60             const = defn.infer_constant()
     61         except ConstantInferenceError:
     62             if isinstance(defn, ir.Expr):

/opt/conda/lib/python3.6/site-packages/numba/ir.py in infer_constant()
    344 
    345     def infer_constant(self):
--> 346         raise ConstantInferenceError('%s' % self, loc=self.loc)
    347 
    348 

/opt/conda/lib/python3.6/site-packages/numba/errors.py in __init__()
    526         self.value = value
    527         msg = "Cannot make a constant from: %s" % value
--> 528         super(ConstantInferenceError, self).__init__(msg, loc=loc)
    529 
    530 

/opt/conda/lib/python3.6/site-packages/numba/errors.py in __init__()
    386         if loc:
    387             super(NumbaError, self).__init__(
--> 388                 highlight("%s\n%s\n" % (msg, loc.strformat())))
    389         else:
    390             super(NumbaError, self).__init__(highlight("%s" % (msg,)))

/opt/conda/lib/python3.6/site-packages/numba/ir.py in strformat()
     85 
     86             ret.extend(selected[:-1])
---> 87             ret.append(_termcolor.highlight(selected[-1]))
     88 
     89             # point at the problem with a caret

IndexError: Failed at nopython (nopython rewrites)
list index out of range

Improving Performance with Cloud-Optimized Geotiffs (COGs) - xarray,rasterio,dask

As we mentioned in our landsat8 demo blog post (https://medium.com/pangeo/cloud-native-geoprocessing-of-earth-observation-satellite-data-with-pangeo-997692d91ca2), there is still much room for improvement.

Here is a nice benchmarking analysis of reading cloud-optimized-geotiffs (COGs) on AWS: https://github.com/opendatacube/benchmark-rio-s3/blob/master/report.md#rasterio-configuration

And discussion of the report here:
http://osgeo-org.1560.x6.nabble.com/Re-Fwd-Cloud-optimized-GeoTIFF-configuration-specifics-SEC-UNOFFICIAL-tt5367948.html

It would be great to do similar benchmarking with our example, and see if there are simple ways to improve how COGs are read with the combination of xarray, dask, and rasterio.
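As a starting point, the benchmark report linked above attributes most of the read-performance difference to a handful of GDAL configuration options. A sketch of those settings (option names come from the report; the values are suggestions to tune for your own workload):

```shell
# GDAL/rasterio tuning for COG reads over object storage (options taken from
# the benchmark report linked above; adjust values for your workload).

# Skip directory listings when opening a remote file (avoids extra S3 requests)
export GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR

# Only probe URLs ending in .tif through the HTTP/S3 virtual filesystem
export CPL_VSIL_CURL_ALLOWED_EXTENSIONS=".tif"

# Cache blocks read from the network in memory (size in bytes)
export VSI_CACHE=TRUE
export VSI_CACHE_SIZE=1000000
```

These can also be set per-process with `rasterio.Env(...)` instead of environment variables, which is handy when benchmarking different combinations from a notebook.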

Pinging some notebook authors on this one, @mrocklin, @jhamman, @rsignell-usgs, @darothen !

The landsat notebook does not work: gdal installation problem

Same issue as pangeo-data/pangeo-tutorial#14. Just opening it here too to remember fixing it there also.

It seems the Docker image built from this repo does not have rasterio/gdal correctly installed. This results in an error when importing rasterio, saying that libgdal is missing.

I believe the fix may not be too hard; I encountered this today while running a tutorial at a French lab. I have not had time to investigate yet; we may just need to update the repo2docker build, or point to the common Pangeo images.

Setting up Dask Distributed Cluster

In this Notebook, it looks like a cluster cannot be set up using cluster = KubeCluster(n_workers=10) any more. Instead, one should follow the Dask Kubernetes documentation and use a worker specification YAML file.
However, when I execute the following on pangeo's binder:

cluster = KubeCluster.from_yaml('worker-spec.yml')
cluster.scale_up(10)
client = Client(cluster)
client

I cannot open the dashboard link (site 10.48.180.27 can't be reached). Any idea why?
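For reference, a minimal `worker-spec.yml` along the lines of the Dask Kubernetes documentation looks roughly like this (the image, resources, and worker arguments are placeholders that should match your deployment):

```yaml
# Minimal worker-spec.yml sketch -- image name, resources, and worker
# arguments are placeholders; match them to your deployment.
kind: Pod
spec:
  restartPolicy: Never
  containers:
  - name: dask
    image: daskdev/dask:latest
    imagePullPolicy: IfNotPresent
    args: [dask-worker, --nthreads, '2', --memory-limit, 6GB, --death-timeout, '60']
    resources:
      limits:
        cpu: "2"
        memory: 6G
      requests:
        cpu: "2"
        memory: 6G
```

As for the unreachable dashboard: an address like 10.48.180.27 is a cluster-internal IP, so from a Binder session it usually has to be reached through the JupyterHub/Binder proxy URL rather than directly.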

error in SOSE_tracer_budgets.ipynb

gateway = Gateway() in the first cell raises the error below. Could anyone help with this? Thank you!

ValueError                                Traceback (most recent call last)
<ipython-input-...> in <module>
      2 from dask_gateway import Gateway
      3
----> 4 gateway = Gateway()
      5 cluster = gateway.new_cluster()
      6 cluster.scale(30)

~/opt/anaconda3/lib/python3.7/site-packages/dask_gateway/client.py in __init__(self, address, proxy_address, auth, asynchronous, loop)
    266         if address is None:
    267             raise ValueError(
--> 268                 "No dask-gateway address provided or found in configuration"
    269             )
    270         address = address.rstrip("/")

ValueError: No dask-gateway address provided or found in configuration
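This error means dask-gateway found no `gateway.address` in the Dask configuration; on a Pangeo hub this is preconfigured, but when running elsewhere you have to supply it yourself. A sketch of the configuration file, with placeholder values for the address:

```yaml
# ~/.config/dask/gateway.yaml -- the address below is a placeholder;
# use your hub's actual dask-gateway endpoint.
gateway:
  address: "https://<your-hub>/services/dask-gateway"
  auth:
    type: jupyterhub
```

Equivalently, the address can be passed directly as `Gateway(address=...)` instead of via the config file.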

Landsat example with all hvplot graphics

@scottyhq and @rabernat, partly to bone up on my hvplot/pyviz skills, I replaced all the plots in the Landsat NDVI notebook with hvplot interactive graphics here.

Not sure whether you want this as a PR or how best to handle it, but just wanted to let you know since it's pretty cool. For example, when live, you can zoom on one of the small-multiple plots and all the plots zoom together:
(screenshot: linked zooming across the small-multiple plots)

Thanks to the pangeo binder instance, you can also run the notebook live:
Binder


error in Landsat8-cog-ndvi-hvplot.ipynb

Running this line in Landsat8-cog-ndvi-hvplot.ipynb:

display(band1.hvplot(rasterize=True, width=600, height=400, cmap='viridis'))

results in:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/srv/conda/envs/iocasdata/lib/python3.7/site-packages/xarray/core/utils.py in __len__(self)
    501         try:
--> 502             return self.shape[0]
    503         except IndexError:

IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<timed eval> in <module>

/srv/conda/envs/iocasdata/lib/python3.7/site-packages/hvplot/plotting/core.py in __call__(self, x, y, kind, **kwds)
     70                 return pn.panel(plot, **panel_dict)
     71 
---> 72         return self._get_converter(x, y, kind, **kwds)(kind, x, y)
     73 
     74     def _get_converter(self, x=None, y=None, kind=None, **kwds):

/srv/conda/envs/iocasdata/lib/python3.7/site-packages/hvplot/converter.py in __call__(self, kind, x, y)
    942                 obj = DynamicMap(cbcallable, streams=[self.stream])
    943             else:
--> 944                 obj = method(x, y)
    945 
    946         if self.crs and self.project:

/srv/conda/envs/iocasdata/lib/python3.7/site-packages/hvplot/converter.py in rgb(self, x, y, z, data)
   1586             z = list(data.data_vars)[0]
   1587         data = data[z]
-> 1588         nbands = len(data.coords[bands])
   1589         if nbands < 3:
   1590             raise ValueError('Selected bands coordinate (%s) has only %d channels,'

/srv/conda/envs/iocasdata/lib/python3.7/site-packages/xarray/core/dataarray.py in __len__(self)
    541 
    542     def __len__(self) -> int:
--> 543         return len(self.variable)
    544 
    545     @property

/srv/conda/envs/iocasdata/lib/python3.7/site-packages/xarray/core/utils.py in __len__(self)
    502             return self.shape[0]
    503         except IndexError:
--> 504             raise TypeError("len() of unsized object")
    505 
    506 

TypeError: len() of unsized object

and package versions:

# Print package versions
print('xarray version: ', xr.__version__)
print('rasterio version: ', rasterio.__version__)
print('hvplot version: ', hvplot.__version__)

xarray version: 0.15.1
rasterio version: 1.1.3
hvplot version: 0.5.2
