Giter VIP home page Giter VIP logo

Comments (9)

euronion avatar euronion commented on June 11, 2024

No. As far as I know there is no tutorial for setting up new data sources at the moment.

The ERA-5 data is also downloaded from the CDS, see datasets/era5.py. You should be able to use it as a starting point. I expect the structure for the request and postprocessing to be similar.

Important side note:
@coroa Has been working on a new data backend. This will also affect the way the data importers work. You might want to check with him first to see what will change in the upcoming version.

from atlite.

coroa avatar coroa commented on June 11, 2024

from atlite.

coroa avatar coroa commented on June 11, 2024

I put a bit more thought into locating the correct grid cells. Let me illustrate the problem and what I assume the solution is (and I think it would be great to integrate that into atlite).

Consider the following plot of the river Saale in Thuringia and its basin, which flows into the Elbe (not shown):
GWK56_meanfriv
Red lines are the river itself (solid is the river network of Saale up to level 3, dashed its tributaries on level 4, where levels are taken from the German Gewässerkennzahl system). In Orange, I added the delineations of the catchment basins as provided by the Hydrobasins dataset (in level 7, i think, but would have to check). In the background I show the mean river flow from a dataset produced by the HD-model maintained by the Helmholtz Zentrum Geesthacht, which is similar to LISFLOOD, albeit simpler. The colored dots show GRDC stations for which measured daily river flow data can be requested from the BfG (a German institution).

The problem is now that for example for the GRDC station marked in orange the closest river flow grid cell holds the river flow through the main river stem coming from the South, while the station only sits at a tiny tributary entering from the right. So as a correction, in this case one would have to take the river flow immediately to the East.

From the EFAS quote above and a discussion I had with HZG earlier, I gather that the preferred solution to this conundrum is to publish the catchment basin area for each grid cell alongside the grid.

To locate the best grid cell for a particular latitude, longitude position, you then compute the total area of the upstream catchment basins (based on hydrobasins for example) and find the grid cell close to your location that best represents the upstream area.

from atlite.

matteodefelice avatar matteodefelice commented on June 11, 2024

To locate the best grid cell for a particular latitude, longitude position, you then compute the total area of the upstream catchment basins (based on hydrobasins for example) and find the grid cell close to your location that best represents the upstream area.

I am not sure to have understood correctly (I am not an expert for hydrological data). What does it mean for a grid cell to represent the upstream area?

from atlite.

coroa avatar coroa commented on June 11, 2024

Neither am I, but what I mean is:

Look through all neighbouring grid cells (probably only direct neighbours) and use the one where the hydrobasins estimated upstream area is similar to the one provided by the EFAS grid.

from atlite.

coroa avatar coroa commented on June 11, 2024

Estimation of upstream area using hydrobasins would work something like:

gpd.sjoin(powerplants, hydrobasins[['UP_AREA', 'geometry']], how='left', op='within').rename(columns={'UP_AREA': 'estim_area'})

A test with the reported upstream areas for the German stations in the GRDC stations catalogue finds a relatively good agreement for rivers with high flow volumes:

import os
import numpy as np
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

HYDRO_BASINS_DATA_PATH = os.path.join(
    'hybas_eu_lev01-12_v1c',
    'hybas_eu_lev12_v1c.shp'
)

GRDC_PATH = 'grdc_stations.xlsx'

hydrobasins = gpd.read_file(HYDRO_BASINS_DATA_PATH)

grdc = pd.read_excel(GRDC_PATH, na_values=[-999, 'n.a.'])
grdc_de = grdc.loc[grdc.country == 'DE']

grdc_de = gpd.GeoDataFrame(grdc_de, geometry=grdc_de[['long', 'lat']].apply(Point, axis=1), crs=dict(init='epsg:4326'))
grdc_de =  gpd.sjoin(grdc_de, hydrobasins[['UP_AREA', 'geometry']], how='left', op='within').rename(columns={'UP_AREA': 'estim_area'})

pd.DataFrame(grdc_de).plot.scatter(x='area', y='estim_area', s=grdc_de.r_volume_yr)
plt.xscale('log')
plt.yscale('log')

grdc_estim_upstream_area

from atlite.

coroa avatar coroa commented on June 11, 2024

Anyway, since the link to upArea.nc the EFAS available data page is still dead, it makes sense to reach out to them, anyway. Would you mind doing that, Matteo?

from atlite.

matteodefelice avatar matteodefelice commented on June 11, 2024

Sure! I will let you know once I get a reply.

from atlite.

coroa avatar coroa commented on June 11, 2024

It seems, your email was received, since the wiki page has just now been updated with a note, that the upstream area is provided as an extra variable in the netcdf files:

In [5]: ds.upArea                                                                                                      
Out[5]: 
<xarray.DataArray 'upArea' (y: 950, x: 1000)>
[950000 values with dtype=float32]
Coordinates:
  * y           (y) float64 5.498e+06 5.492e+06 ... 7.575e+05 7.525e+05
  * x           (x) float64 2.502e+06 2.508e+06 ... 7.492e+06 7.498e+06
    time        datetime64[ns] ...
    step        timedelta64[ns] ...
    surface     int64 ...
    latitude    (y, x) float32 ...
    longitude   (y, x) float32 ...
    valid_time  datetime64[ns] ...
Attributes:
    grid_mapping:    lambert_azimuthal_equal_area
    long_name:       upstream_area
    standard_name:   upstream
    units:           M2
    esri_pe_string:  PROJCS["ETRS_1989_LAEA",GEOGCS["GCS_ETRS_1989",DATUM["D_...
    cell_methods:    time: mean

from atlite.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.