
nsidc-data-access-notebook's Introduction

NSIDC-Data-Access-Notebook

A Jupyter notebook exploring data coverage, size, and customization service availability along with direct data download utilizing the NSIDC DAAC's Data Access and Service API. If you are new to Jupyter Notebooks, you can read the documentation, or you can walk through the user interface tour once you have loaded the notebook by selecting Help -> User Interface Tour in the menubar.

This notebook was originally developed for the April 2019 USFS – NASA Joint Applications Workshop.

Usage with Binder

The Binder button above allows you to explore and run the notebook in a shared cloud computing environment without the need to install dependencies on your local machine. Note that this option will not directly download data to your computer; instead the data will be downloaded to the cloud environment. If you are interested in bulk downloading data, we recommend running this notebook locally using the Conda or Docker options.

Usage with Docker

Docker is the preferred way of running these notebooks. To install it, refer to the Install Docker guide for your operating system. We will use NSIDC's tutorials image, which can be pulled from the Docker registry with:

docker pull nsidc/tutorials

On Mac OSX or Linux

Install Docker. Use the left-hand navigation to select the appropriate install depending on operating system.
Download the NSIDC-Data-Access-Notebook repository from GitHub.
Unzip the file, and open a terminal window in the NSIDC-Data-Access-Notebook folder's location.
From the terminal window, launch the docker container using the following command, replacing [path/notebook_folder] with your path and notebook folder name:

docker run --name tutorials -p 8888:8888 -v [path/notebook_folder]:/home/jovyan/work nsidc/tutorials

Example:

docker run --name tutorials -p 8888:8888 -v /Users/name/Desktop/NSIDC-Data-Access-Notebook:/home/jovyan/work nsidc/tutorials

If you want to mount a directory with write permissions, you need to grant the container the same permissions as the directory to be mounted and tell it that it has "root" access (within the container). This is important if you want to persist your work or download data to a local directory rather than only inside the Docker container.

docker run --name tutorials -e NB_UID=$(id -u) --user root -p 8888:8888 -v /Users/name/Desktop/NSIDC-Data-Access-Notebook:/home/jovyan/work nsidc/tutorials

The initialization will take some time and will require 1.8 GB of space. Once the startup is complete you will see a line of output similar to this:

http://(6a8bfa6a8518 or 127.0.0.1):8888/?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f

Copy everything from the :8888 to the end. Open up a web browser and in the address field type localhost, paste the copied text, and hit return. The address should look something like this:

localhost:8888/?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f
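The copy-and-paste step above can also be sketched in Python. This is only an illustration: the helper name localhost_url is hypothetical, and the pattern assumes the startup line has the same ":<port>/?token=<hex token>" shape as the example output shown above.

```python
import re


def localhost_url(startup_line: str) -> str:
    """Turn the URL printed by the container into a localhost address."""
    # Keep everything from ":<port>" to the end of the token.
    match = re.search(r":(\d+)/\?token=([0-9a-f]+)", startup_line)
    if match is None:
        raise ValueError("no ':<port>/?token=<token>' part found in the line")
    port, token = match.groups()
    return f"localhost:{port}/?token={token}"


line = ("http://(6a8bfa6a8518 or 127.0.0.1):8888/"
        "?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f")
print(localhost_url(line))
```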

You will be brought to the Jupyter Lab interface running through the Docker container. The left side of the interface displays your local directory structure. Navigate to the work folder of the NSIDC-Data-Access-Notebook repository folder. You can now interact with the notebooks to explore and access data.

Note: if you prefer to use JupyterLab, use the following URL and log in with the same token: http://127.0.0.1:8888/lab

On Windows

Install Docker.
Download the NSIDC-Data-Access-Notebook repository from GitHub.
Unzip the file, and open a terminal window (use Command Prompt or PowerShell, not PowerShell ISE) in the NSIDC-Data-Access-Notebook folder's location.
From the terminal window, launch the docker container using the following command, replacing [path\notebook_folder] with your path and notebook folder name:

docker run --name tutorials -p 8888:8888 -v [path\notebook_folder]:/home/jovyan/work nsidc/tutorials

Example:

docker run --name tutorials -p 8888:8888 -v C:\notebook_folder:/home/jovyan/work nsidc/tutorials

To mount the folder with write permissions, run the container as root:

docker run --name tutorials --user root -p 8888:8888 -v C:\notebook_folder:/home/jovyan/work nsidc/tutorials

The initialization will take some time and will require 1.8 GB of space. Once the startup is complete you will see a line of output similar to this:

To access the notebook, open this file in a browser:
        file:///home/jovyan/.local/share/jupyter/runtime/nbserver-6-open.html
    Or copy and paste one of these URLs:
        http://(6a8bfa6a8518 or 127.0.0.1):8888/?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f

Follow the instructions and copy one of the URLs into a web browser and hit return. The address should look something like this:

http://127.0.0.1:8888/?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f

You will now see the NSIDC-Data-Access repository within the Jupyter Notebook interface. Navigate to /work to open the notebooks.
You can now interact with the notebooks to explore and access data.

Note: if you prefer to use JupyterLab, use the following URL and log in with the same token: http://127.0.0.1:8888/lab

Usage with Conda

Install miniconda3 (Python 3.7) for your platform from https://docs.conda.io/en/latest/miniconda.html
Download the NSIDC-Data-Access-Notebook repository from GitHub.
Unzip the file, and open a command line or terminal window in the NSIDC-Data-Access-Notebook folder's location.
From a command line or terminal window, install the required environment with the following command:

conda env create -f binder/environment.yml

You should now see that the dependencies were installed and that the environment is ready to be used.
Activate the environment with source activate icepick or conda activate icepick
Launch the notebook locally with the following command:

jupyter lab

This should open a browser window displaying your current working directory contents. Navigate to the notebooks folder and click on the Customize and Access NSIDC Data.ipynb file. You can now interact with the notebook to explore and access data.

nsidc-data-access-notebook's Issues

NameError: name 'Boundingshape' is not defined

When I run the notebook I get this error at the cell "Create the API endpoint":

NameError                                 Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_15140\1286720228.py in <module>
     22                   'time': time_var,
     23                   'polygon': polygon,
---> 24                   'Boundingshape': Boundingshape,
     25                   'format': reformat,
     26                   'projection': projection,

NameError: name 'Boundingshape' is not defined

The problem might be that I provided the area of interest with a shapefile, but in the 'subdict' object, the field 'spatialSubsettingShapefile' is listed as 'false'. 'temporalSubsetting' is also 'false' even though I provided dates.

This seems to result in Boundingshape never being defined in the block titled "Select subsetting, reformatting, and reprojection service options, if available." because the expected logical conditions are not met.

Here are the values for some of the variables in my workspace:

aoi = '2'
subdict = {
    'id': 'HEG',
    'spatialSubsetting': 'true',
    'spatialSubsettingShapefile': 'false',
    'temporalSubsetting': 'false',
    'type': 'both',
    'maxGransSyncRequest': '100',
    'maxGransAsyncRequest': '2000'
}

Relevant code block input and output below, and let me know if you need any more information:


Input short name, e.g. ATL03, here:  MOD10_L2
Input start date in yyyy-MM-dd format:  2022-05-29
Input start time in HH:mm:ss format:  14:00:00
Input end date in yyyy-MM-dd format:  2022-06-22
Input end time in HH:mm:ss format:  16:00:00

# Here I just used the default shapefile that came with the notebook
Simplified polygon coordinates based on shapefile input: -51.1203803853486,69.21453533085385,-51.25214005333067,69.18690191925793,-50.04906889941042,69.08708570400971,-48.98867272012586,68.86377341849372,-48.47406719345924,68.95294818787565,-48.22490237564403,69.07628146713546,-48.64390544471899,69.42180752487434,-49.07439704242353,69.52418890858146,-50.41130570916474,69.24203271223362,-51.1203803853486,69.21453533085385

There are 168 granules of MOD10_L2 version 61 over my area and time of interest.
The average size of each granule is 10.55 MB and the total size of all 168 granules is 1772.72 MB

These reformatting options are available: ['GeoTIFF']
If you would like to reformat, copy and paste the reformatting option you would like (make sure to omit quotes, e.g. GeoTIFF), otherwise leave blank. GeoTIFF

These reprojection options are available with your requested format: ['GEOGRAPHIC', 'UNIVERSAL TRANSVERSE MERCATOR', 'POLAR STEREOGRAPHIC', 'SINUSOIDAL']
If you would like to reproject, copy and paste the reprojection option you would like (make sure to omit quotes), otherwise leave blank. UNIVERSAL TRANSVERSE MERCATOR

Variable subsetting is available. Would you like to subset a selection of variables? (y/n) n

There will be 1 total order(s) processed for our MOD10_L2 request.

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_5656\1286720228.py in <module>
     22                   'time': time_var,
     23                   'polygon': polygon,
---> 24                   'Boundingshape': Boundingshape,
     25                   'format': reformat,
     26                   'projection': projection,

NameError: name 'Boundingshape' is not defined
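One way to avoid this kind of NameError is to give every optional variable a default before any conditional block assigns it, and to leave unset values out of the request. The sketch below is not the notebook's code: build_request_params is a hypothetical helper, and the short_name and version keys are only illustrative.

```python
def build_request_params(required, optional):
    """Merge required and optional parameters, skipping unset optional ones."""
    params = dict(required)
    for key, value in optional.items():
        if value not in (None, ""):
            params[key] = value
    return params


# Defaults are assigned up front, so each name exists even when the
# capabilities report a service as unavailable ('false').
Boundingshape = ""  # stays empty when shapefile subsetting is not offered
reformat = "GeoTIFF"
projection = "UNIVERSAL TRANSVERSE MERCATOR"

params = build_request_params(
    required={"short_name": "MOD10_L2", "version": "61"},  # illustrative keys
    optional={
        "Boundingshape": Boundingshape,
        "format": reformat,
        "projection": projection,
    },
)
print(params)
```

With this pattern the dictionary is built without error, and Boundingshape is simply absent from the request when no shapefile subsetting was selected.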

Connection Issues

With the notebook provided, I frequently get the following error and cannot reliably make GET requests:

ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

I get this when running the cell:

# Query service capability URL 

from xml.etree import ElementTree as ET

capability_url = f'https://n5eil02u.ecs.nsidc.org/egi/capabilities/{short_name}.{latest_version}.xml'

print(capability_url)
# Create session to store cookie and pass credentials to capabilities url

session = requests.session()
s = session.get(capability_url)
response = session.get(s.url,auth=(uid,pswd))

root = ET.fromstring(response.content)

Is this a maintenance issue? Or could it be caused by working remotely, away from a NASA facility?

Hope you are all taking good care during these trying and scary times.
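If the disconnects are intermittent, retrying the request with a growing delay may help. The sketch below is a generic pattern, not part of the notebook: get_with_retries and flaky_fetch are hypothetical names, and the stand-in function simulates a connection that fails twice before succeeding. It catches OSError on the assumption that the connection errors raised by the requests library derive from it.

```python
import time


def get_with_retries(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, waiting longer after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ...


# Stand-in for session.get(...): fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("Remote end closed connection without response")
    return "capabilities XML"


delays = []  # record the waits instead of actually sleeping
result = get_with_retries(flaky_fetch, sleep=delays.append)
print(result, delays)
```

In the cell above, the call could be wrapped as get_with_retries(lambda: session.get(capability_url)). Retrying only helps with transient failures; it will not fix a server that is down for maintenance.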

Tracking limits and cancelling orders

Thank you for your notebooks in navigating the NSIDC API.

I am downloading a lot of data and am wondering if there are ways to:

Keep track of orders associated with my earthdata UID or a given token; if so, can I cancel previous orders?
Track rate limits and order limits. Right now, experimenting with your notebook to download lots of data, I have already exceeded the limits. The async requests are indefinitely pending and the sync requests output the message Rejected request due to capacity limitations.

Download seems to be truncated

I am trying to download a year's worth of the 8-day MODIS10A data product (500-m Snow Cover) via this notebook. According to the Earthdata interface, I need 13367 granules. The script calculates that 7 zip files of up to 2000 images each are needed, and 7 zip files are processed. However, only 4000 files are actually produced. Is this a known problem?
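The expected numbers can be checked with a little arithmetic. This sketch uses only the figures quoted in the post (13367 granules, 2000 per order); the variable names are illustrative.

```python
import math

granules = 13367      # granule count quoted in the post
max_per_order = 2000  # per-order limit quoted in the post

# Number of orders needed to cover every granule.
orders = math.ceil(granules / max_per_order)

# Six full orders plus one partial order.
full_orders, remainder = divmod(granules, max_per_order)
order_sizes = [max_per_order] * full_orders + ([remainder] if remainder else [])

print(orders)                 # 7
print(order_sizes[-1])        # 1367
print(sum(order_sizes))       # 13367
print(4000 // max_per_order)  # 2
```

The 4000 files received correspond to exactly two full orders, which may suggest that only two of the seven orders delivered output, although the post does not confirm this.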

docker: Error response from daemon

Once I quit the Jupyter notebook that I started in the terminal with Docker running, how can I start it again? I tried to follow the guidance again, but I got an error like this:
docker: Error response from daemon: driver failed programming external connectivity on endpoint recursing_lovelace (283e0e1985b83cffb5615d65818f66c8f780f03c3fba00227e0c43a9bc5be5db): Error starting userland proxy: /forwards/expose/port returned unexpected status: 500.

Add some error handling before processing results from CMR in "Customize and Access NSIDC Data" notebook

More details: #10 (comment)

Determine how many granules exist over this time and area of interest

I ran the Customize and Access NSIDC Data notebook to download data, but an error occurred in the section "Determine how many granules exist over this time and area of interest".

Is there something wrong? All the parameters are the defaults.

KeyError:'token'

Hello, I have correctly typed my username, password, and email, but when I run the second code cell I get the error below:

KeyError                                  Traceback (most recent call last)
<ipython-input-18-37a76871ddd5> in <module>
     14 headers={'Accept': 'application/json'}
     15 response = requests.post(token_api_url, json=data, headers=headers)
---> 16 token = json.loads(response.content)['token']['id']
     17 print(token)

KeyError: 'token'

Are there any solutions? Thank you!
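A KeyError here means the JSON returned by the token request has no 'token' entry, so the reply probably contains an error message instead. A small guard makes that message visible. This is a sketch: extract_token is a hypothetical helper, and the error payload used in the demonstration is invented for illustration, not a documented response.

```python
import json


def extract_token(body: bytes) -> str:
    """Return the token id, or raise an error that shows the server's reply."""
    payload = json.loads(body)
    try:
        return payload["token"]["id"]
    except (KeyError, TypeError):
        raise RuntimeError(
            f"Token request did not return a token; server replied: {payload!r}"
        ) from None


# Successful reply with the structure the notebook expects.
print(extract_token(b'{"token": {"id": "abc123"}}'))

# Hypothetical error reply: the message is reported instead of a bare KeyError.
try:
    extract_token(b'{"errors": ["hypothetical error message"]}')
except RuntimeError as err:
    print(err)
```

In the failing cell, line 16 could then become token = extract_token(response.content).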

capability url returns invalid request

I have a workflow to request subsetted ICESat-2 data based on this data access notebook. It worked until yesterday, but now the capability url returns an invalid request. The same thing happens when running this notebook in Binder, for all the data product short names that I have tried.

capability_url: https://n5eil02u.ecs.nsidc.org/egi/capabilities/ATL03.005.xml
response: <Response [400]> b'{"error":"invalid_request"}'

This triggers an error in the next line:

Traceback (most recent call last):

  File "/srv/conda/envs/notebook/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "/tmp/ipykernel_168/938956396.py", line 13, in <module>
    root = ET.fromstring(response.content)

  File "/srv/conda/envs/notebook/lib/python3.7/xml/etree/ElementTree.py", line 1315, in XML
    parser.feed(text)

  File "<string>", line unknown
ParseError: not well-formed (invalid token): line 1, column 0

Any idea what the issue could be here?

Thanks,
Philipp
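The ParseError is a side effect of passing a JSON error body to the XML parser. Checking the HTTP status before parsing reports the real problem. The sketch below assumes only what the post shows (a status code and a bytes body); parse_capabilities is a hypothetical helper, not part of the notebook.

```python
from xml.etree import ElementTree as ET


def parse_capabilities(status_code: int, content: bytes) -> ET.Element:
    """Parse the capabilities XML, failing early with a readable message."""
    if status_code != 200:
        raise RuntimeError(
            f"Capabilities request failed with HTTP {status_code}: {content[:200]!r}"
        )
    try:
        return ET.fromstring(content)
    except ET.ParseError as err:
        raise RuntimeError(
            f"Response is not XML ({err}): {content[:200]!r}"
        ) from err


# The response quoted in the post: status 400 with a JSON body.
try:
    parse_capabilities(400, b'{"error":"invalid_request"}')
except RuntimeError as err:
    print(err)

# A well-formed document parses as usual.
root = parse_capabilities(200, b"<root><child/></root>")
print(root.tag)
```

In the notebook this would be called as parse_capabilities(response.status_code, response.content). It does not resolve the 400 response itself, but it replaces the misleading ParseError with the server's own message.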