ec-jrc / thalassa

Large Scale Sea level visualizations of unstructured mesh data

License: European Union Public License 1.2

Python 73.38% Jupyter Notebook 24.60% Makefile 1.34% Shell 0.68%

hvplot datashader unstructured-meshes large-dataset

thalassa's Introduction

Thalassa

Thalassa is a library for visualizing unstructured mesh data, with a focus on large-scale sea level data.

It builds upon geoviews and datashader and can easily handle meshes with millions of nodes interactively.

Thalassa currently supports visualizing the output of the following solvers:

Adding support for new solvers is relatively straightforward.

Installation

PyPI

Install the binary dependencies:

python >= 3.9

Install from PyPI with:

pip install thalassa

Conda

You can also install using conda/mamba:

mamba install -y -c conda-forge thalassa

Obtaining Data

You will need some data to visualize. You can download sample datasets from the following links:

2D Output from the STOFS-2D Global model which uses ADCIRC from here (12GB)
3D Output from the STOFS-3D Atlantic model which uses Schism 5.9 (old IO) from here (12GB)
2D Output from the STOFS-3D Atlantic model which uses Schism 5.10 (new IO) from here (3GB)

Thalassa-server

thalassa-server is a web application leveraging the thalassa library and panel. Check it out!

Developing

Prerequisites

For development we use poetry and pre-commit. You can install both with pipx:

# poetry
pipx install poetry
pipx inject poetry poetry-dynamic-versioning
pipx inject poetry poetry-plugin-export
# pre-commit
pipx install pre-commit

Install dependencies

Just run:

make init

License

The project is released under the EUPL v1.2 license, which is compatible with GPL v3.

thalassa's People

Contributors

Stargazers

Watchers

Forkers

pmav99 vvoukouvalas moghimis brey saeed-moghimi-noaa yosoyjay tacc chilipp wenfanwu yunfangsun oceanmodeling arnimtest seareport tomsail

thalassa's Issues

Add support for ADCIRC

It would be great if thalassa could support ADCIRC output, too.

@saeed-moghimi-noaa could you provide a sample NetCDF file?

I can see you are building something Awesome that could inspire the HoloViz Panel community. If you have the time it would be much appreciated if you would showcase your work in the HoloViz discourse Forum. Thanks.

Add TELEMAC support

TELEMAC is using its own output format called serafin or selafin. There is a GDAL driver for this format, but AFAIK there is no out-of-the-box way of opening them with xarray.

Since we would like to be able to support TELEMAC in thalassa, we should investigate:

if it is possible to open SELAFIN files using xarray
Failing that, if it is possible to convert SELAFIN files to something that xarray can handle (e.g. netcdf).

As soon as we have the data in xarray we can add a "normalization" function: https://github.com/ec-jrc/Thalassa/blob/master/thalassa/normalization.py

There is a relatively long issue discussing something similar here. I only skimmed through it but we might find something useful there.

Welcome to the Awesome List

I've added you to the awesome-list for Panel. It will be available later today when I redeploy.

https://awesome-panel.org/awesome-list?search=thalassa

(I'm a contributor to Panel and run https://awesome-panel.org/. Feel free to link up on https://twitter.com/MarcSkovMadsen and https://www.linkedin.com/in/marcskovmadsen/)

Polygon-Based Data Cropping

Hi, I have a question: do we need to adjust this part of the code to use a polygon like the one below for cropping the data? Thanks!

polygon = array ([[271.09419606, 30.29241252],
[270.80483714, 30.47217758],
[270.31435613, 30.35380055],
[269.89991289, 30.0740687 ],
[269.9734827 , 29.72115943],
[270.38302128, 29.34776829],
[270.5129946 , 28.86997287],
[270.96177041, 28.87641531],
[271.38357063, 29.16376929],
[271.45468811, 29.68281716],
[271.68765915, 29.83609834],
[271.65087425, 29.96790232],
[271.09419606, 30.29241252]])
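
One way to approach this, independent of whatever cropping helper the library provides, is to build a boolean mask of the mesh nodes that fall inside the polygon and then keep only the elements whose nodes are all selected. The following is a minimal sketch of the first step using plain NumPy and the even-odd (ray casting) rule. The function name `points_in_polygon` is hypothetical and not part of thalassa, and the sketch assumes the node longitudes use the same 0-360 convention as the polygon above.

```python
import numpy as np


def points_in_polygon(x, y, polygon):
    """Return a boolean mask of the points (x, y) that lie inside `polygon`.

    `polygon` is an (N, 2) sequence of vertices; it may or may not repeat
    the first vertex at the end. Uses the even-odd (ray casting) rule.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    poly = np.asarray(polygon, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    x0, y0 = poly[-1]
    for x1, y1 in poly:
        # Does this edge cross the horizontal line through each point?
        straddles = (y0 > y) != (y1 > y)
        with np.errstate(divide="ignore", invalid="ignore"):
            # Longitude at which the edge crosses that horizontal line.
            # Horizontal edges give inf/nan here but never "straddle".
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            crosses_to_the_right = straddles & (x < x_cross)
        # Each crossing to the right of the point flips inside/outside.
        inside ^= crosses_to_the_right
        x0, y0 = x1, y1
    return inside


# Toy example with a unit square; real usage would pass the node
# longitudes and latitudes of the mesh plus the polygon from the issue.
square = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
mask = points_in_polygon([0.5, 1.5], [0.5, 0.5], square)
print(mask)
```

Points that lie exactly on the polygon boundary may be classified either way, which is usually acceptable for cropping a visualization.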

Matching dynamic color-bar and Y axis of the time series

Clean up Seareport references

Seareport is the name of the web application that we will be using internally at the JRC to visualize the global SCHISM model that we run. While discussing Seareport, we inserted some references to it (logos, design choices etc.) into the thalassa codebase. We should remove all these references.

Thalassa should be:

An API
A reference implementation of a dashboard utilizing the API.

Everything else should live in a separate repository which uses thalassa as an upstream dependency.

setup binder

Show grid (mesh) plus water depth as default background

Would you please rename grid to mesh? At NOAA, we use mesh for unstructured and grid for structured domains.
It would be great to explore rendering water depth as the background values for the mesh wireframe.

Thalassa latest version

Hi @brey @pmav99

Would you please let me know where to find the latest stable code that you are using? Folks on our side are getting interested in looking into Thalassa.

Would it be possible to update the main branch in the JRC repo? Does pip install grab the latest version?

Best,
-Saeed

Question about support for SCHISM output file

I've been trying to use Thalassa (master branch) on some SCHISM combined output .nc files, but I keep getting an out-of-bounds index error. The SCHISM solution is on a completely triangular mesh (no mixed elements). Is SCHISM output supported out of the box, or do I first need to reindex the .nc files (e.g. 0-based vs 1-based indexing)?
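
If the index base does turn out to be the cause, the connectivity table can be checked and shifted before plotting. The sketch below is only an illustration under stated assumptions: the helper name `to_zero_based` is hypothetical, the mesh is purely triangular with no fill values, and every node is referenced by at least one element. It does not describe what thalassa does internally.

```python
import numpy as np


def to_zero_based(connectivity, n_nodes):
    """Return a 0-based copy of an element table that is 0- or 1-based.

    Assumes no fill values (e.g. -1) and that the lowest and highest node
    indices are both used by some element.
    """
    conn = np.asarray(connectivity, dtype=np.int64)
    lo, hi = int(conn.min()), int(conn.max())
    if lo == 0 and hi < n_nodes:
        return conn.copy()  # already 0-based
    if lo >= 1 and hi == n_nodes:
        return conn - 1  # 1-based: shift every index down by one
    raise ValueError(
        f"cannot infer index base: min={lo}, max={hi}, n_nodes={n_nodes}"
    )


# Two triangles over four nodes, written with 1-based indices.
faces = [[1, 2, 3], [2, 3, 4]]
print(to_zero_based(faces, n_nodes=4))
```

Raising on the ambiguous case is deliberate: silently guessing the base would only move the out-of-bounds error somewhere harder to diagnose.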

Consider using cf-xarray and CF-standard for the thalassa schema.

https://github.com/xarray-contrib/cf-xarray

Time series of model results at any given points

Need example for SCHISM old I/O

Is there any example of reading SCHISM old I/O files? I could not find examples of reading multiple files like the following:

../schout_000000_1.nc
../schout_000000_2.nc
../schout_000001_1.nc
../schout_000001_2.nc
../schout_000002_1.nc
../schout_000002_2.nc
../schout_000003_1.nc
../schout_000003_2.nc

I tried to open a single file such as schout_000001_1.nc, but then the tool complains about the lon variable. Maybe I need to read the mesh first, but I'm not sure since the examples are for ADCIRC at this point.
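
As a first step towards handling such a directory, the files can be grouped before deciding how to open them. The sketch below assumes the names follow a `schout_<rank>_<stack>.nc` pattern, i.e. that the first number identifies the process that wrote the file and the second the output stack. That reading of the names is an assumption, and `group_by_stack` is a hypothetical helper, not a thalassa function. If the assumption holds, a single per-rank file only contains part of the mesh, which could explain the complaint about the lon variable.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: schout_<rank>_<stack>.nc
PATTERN = re.compile(r"schout_(\d+)_(\d+)\.nc$")


def group_by_stack(paths):
    """Group file paths by stack number, ordered by rank within each stack."""
    groups = defaultdict(list)
    for path in paths:
        match = PATTERN.search(Path(path).name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        rank, stack = int(match.group(1)), int(match.group(2))
        groups[stack].append((rank, str(path)))
    return {
        stack: [name for _, name in sorted(items)]
        for stack, items in sorted(groups.items())
    }


files = [
    "../schout_000001_1.nc",
    "../schout_000000_1.nc",
    "../schout_000000_2.nc",
    "../schout_000001_2.nc",
]
for stack, names in group_by_stack(files).items():
    print(stack, names)
```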

Deployment approach

Similar to #13, for deployment there are multiple options to consider, and each of them has its own way of building:

Virtual env: uses poetry to build and then installs the wheel into the virtual env
Docker: uses poetry to build and then installs the wheel into the image
Conda: uses panel to start the Jupyter notebook

In my experience, using the notebook to deploy remotely can become tricky, as each connection to the port where panel is served creates a new server on a new, dynamically chosen port. Serving the applications through the panel Python API seems like a better idea: one could open a single port on the remote machine, on which a single Bokeh server listens and can (?) serve multiple users.

Visualizing Global Meshes

This is something that came up after investigating why the STOFS 2D Global model could not be visualized (#54). STOFS is an ADCIRC model, but the same problem should exist for SCHISM models, too, therefore I think that we should have a dedicated issue.

So, visualizing the STOFS model gives this output:

The problem is that there are elements crossing the International Date Line, and this messes up the visualization. The white line at ~65 latitude appears because in that latitude range the date line crosses the region north of Kamchatka (not sure of the actual name).

This problem is something that we have encountered before while trying to visualize global meshes. pyposeidon does provide a fix for this, and its docs explain it in a bit more detail. Applying the fix takes significant time, though, so you need to do it as a post-processing step. You can't do it dynamically in thalassa, at least not for these 12-million-node files.
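
To get a feeling for how many elements are affected, the elements that wrap around the date line can be flagged by their longitude span. This is a minimal sketch only: it is not the fix that pyposeidon applies, the function name `find_wrapping_elements` is hypothetical, and the 180 degree threshold is an assumption that works when no legitimate element is that wide.

```python
import numpy as np


def find_wrapping_elements(lon, triangles, threshold=180.0):
    """Flag triangles whose nodes sit on both sides of the date line.

    `lon` holds node longitudes in degrees, `triangles` is an
    (n_elements, 3) table of 0-based node indices.
    """
    lon = np.asarray(lon, dtype=float)
    tri = np.asarray(triangles, dtype=np.int64)
    tri_lon = lon[tri]  # shape (n_elements, 3)
    span = tri_lon.max(axis=1) - tri_lon.min(axis=1)
    # An element spanning more than `threshold` degrees is assumed to
    # wrap around the globe rather than really being that wide.
    return span > threshold


lon = [179.5, -179.5, 179.0, 10.0, 11.0, 10.5]
triangles = [[0, 1, 2], [3, 4, 5]]
print(find_wrapping_elements(lon, triangles))
```

Dropping the flagged elements before rendering would remove the streaks across the map at the cost of a thin gap along the date line, so it is a crude workaround rather than a replacement for properly splitting those elements.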

@brey can provide more details if necessary.

pinging @saeed-moghimi-noaa

update to bokeh 3.0

default_kwargs not recognised with custom engine

When trying to open a telemac file through thalassa,

ds = thalassa.open_dataset("input.slf", engine="selafin")

with xarray-selafin custom engine
I have this error:

  File "/home/tomsail/work/gist/animations/.venv/lib/python3.11/site-packages/thalassa/api.py", line 82, in open_dataset
    ds = xr.open_dataset(path, **(default_kwargs | kwargs))
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tomsail/work/gist/animations/.venv/lib/python3.11/site-packages/xarray/backends/api.py", line 573, in open_dataset
    backend_ds = backend.open_dataset(
                 ^^^^^^^^^^^^^^^^^^^^^
TypeError: SelafinBackendEntrypoint.open_dataset() got an unexpected keyword argument 'mask_and_scale'

Should I add this argument (maybe even all the [default kwargs](https://github.com/ec-jrc/Thalassa/blob/b9d977cd6999e73f5ad884e9e7b96d4041b60827/thalassa/api.py#L76)) to my custom backend, or should your hook be more forgiving across engines?
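
For the "more forgiving" option, one possibility is to pass a backend only the keyword arguments that its signature accepts. The sketch below illustrates the idea with the standard library; `supported_kwargs` and `fake_backend_open` are hypothetical names, and this is not how thalassa or xarray currently behave.

```python
import inspect


def supported_kwargs(func, kwargs):
    """Return the subset of `kwargs` that `func` can accept by name."""
    params = inspect.signature(func).parameters
    # A function declaring **kwargs accepts everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {key: value for key, value in kwargs.items() if key in params}


# Stand-in for a custom backend that does not know `mask_and_scale`.
def fake_backend_open(filename, *, drop_variables=None):
    return filename


defaults = {"mask_and_scale": True, "drop_variables": None}
print(supported_kwargs(fake_backend_open, defaults))
```

The trade-off is that unsupported defaults are dropped silently, so a variant that logs the discarded keys may be preferable.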

Decision on entrypoint to use

Right now the code has 3 entry points:

thalassa/main.py
Thalassa.ipynb
thalassa/cli.py

This can make getting started with Thalassa confusing. It would be nice if these could be consolidated into one.
Ideally, instead of being static, the CLI script would read a config file (e.g. XML, YAML, etc.) that defines which pages/apps to include.

Add installation instructions in README

Add high(er) level API for interactive usage

Instead of having to create trimesh objects etc, a user working interactively on jupyterlab should be able to just use something like this:

plot(ds, variable="elev", timestamp=ds.time[0])

Test Schism 5.10 (new IO) 3D meshes

Sample files can be downloaded from http://ccrm.vims.edu/yinglong/SVN_large_files/Scribe_IO_outputs/

Current vector plot

All,

Has anybody tried vector plots using the thalassa API?

Something like this:

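import geoviews as gv
import numpy as np
# Note: the second coordinate is assumed to be ds['lat'] (the original passed ds['lon'] twice)
ds['uv_mag_surface'] = (ds.uvel_surface ** 2 + ds.vvel_surface ** 2) ** 0.5
ds['uv_ang_surface'] = np.pi / 2 - np.arctan2(-ds.vvel_surface, -ds.uvel_surface)
vectorfield = gv.VectorField((ds['lon'], ds['lat'], ds['uv_ang_surface'][1, :], ds['uv_mag_surface'][1, :]))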

Any idea how to accomplish that?

Thanks,
-Saeed

Error in Plotting STOFS-3D-Atlantic Data

Hi, I am trying to use the Thalassa package to read and plot STOFS-3D-Atlantic outputs. While thalassa.open_dataset and thalassa.plot work fine with *schout_adcirc outputs, it does not plot other outputs required to visualize current output, such as stofs_3d_atl.t12z.f001_024.field2d.nc. Please see the plot below for an example. Any suggestions on how I can use thalassa.open_dataset and thalassa.plot to plot the data correctly? Thanks for your help and time.

Error when reproducing timeseries branch

All,

Is this error familiar to you?

(Thalassa) moghimi:Thalassa/ (timeseries) $ panel serve run.py [22:59:52]
2022-01-27 22:59:53,604 Starting Bokeh server version 2.4.2 (running on Tornado 6.1)
2022-01-27 22:59:53,605 User authentication hooks NOT provided (default user enabled)
2022-01-27 22:59:53,607 Bokeh app running at: http://localhost:5006/run
2022-01-27 22:59:53,607 Starting Bokeh server with process id: 87984
2022-01-27 22:59:59,048; ERROR ; [87984 - 140292251289408]; thalassa.ui ; error ; 32: ## Please select a dataset_file and click on the Render button.

-Saeed

Proposal for Thalassa v0.2

Hi @ALL

After discussing with @brey I've started to implement what, at least we hope, can be the basis for version 0.2 of Thalassa. The end result looks something like this:

This is pretty much a complete rewrite. At the moment only the "max elevation" view has been implemented, so some work is needed before we can have feature parity with version 0.1. Nevertheless, we think that:

the code is much simpler to work with
the design is much cleaner
the UI is more intuitive
it addresses #15
it gives greater flexibility in designing the UI (e.g. it should be possible to dynamically add/remove the Wireframe by selecting a checkbox etc).

An additional feature worth mentioning is that, with the way the code has been structured, it is possible to display the visual components (i.e. the graphs) in jupyterlab too, which can be particularly helpful when developing/debugging. It looks like this:

How to check it out?

I've pushed the code on the v0.2 branch. I've removed all the non-related code, so no binder and docker dirs etc. I've also updated the README so please check that out too.

For the record, if you have uncommitted changes in your local repo, you might find it easier to create a new git clone and a new virtualenv/conda env.

Known problems

As you will probably notice there is no WMS layer. The reason why I have removed it is the impact it had on performance. In a nutshell, holoviews seems to be doing some type of reprojection when you create an overlay that includes a WMS which seems to be really slow and which happens every time you make a change to the viewport (e.g. pan or zoom). I believe that it will be possible to avoid the slow-down but I haven't had the time to dig in deeper. For the record, this also affects Thalassa v0.1 so let's discuss this on a separate issue.

Speed up rendering of Stations

The main bottleneck seems to be the unpacking of stations.tar.gz.

The issue here is the gzip compression. gzip compresses all the files in the archive together. Consequently, even if you only need a single file from the archive you need to decompress the whole archive in order to get it.

Other formats, e.g. zip, compress each file separately, which lets you decompress only the files that you need:

from zipfile import ZipFile

# Read a single member without decompressing the rest of the archive
with ZipFile('spam.zip') as myzip:
    with myzip.open('eggs.txt') as myfile:
        print(myfile.read())

https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.open
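
If switching formats is an option, the existing archive could be repacked once, offline. The following is a minimal sketch using only the standard library. It assumes the archive contains plain files, the function name `targz_to_zip` is hypothetical, and it works on file objects so that it can be tried without touching the disk.

```python
import io
import tarfile
import zipfile


def targz_to_zip(src_fileobj, dst_fileobj):
    """Copy every regular file from a .tar.gz stream into a zip archive."""
    with tarfile.open(fileobj=src_fileobj, mode="r:gz") as tar, zipfile.ZipFile(
        dst_fileobj, mode="w", compression=zipfile.ZIP_DEFLATED
    ) as zf:
        for member in tar:
            if member.isfile():  # skip directories, links, etc.
                zf.writestr(member.name, tar.extractfile(member).read())


# Build a tiny in-memory .tar.gz to demonstrate the conversion.
source = io.BytesIO()
with tarfile.open(fileobj=source, mode="w:gz") as tar:
    for name, payload in [("a.csv", b"1,2"), ("b.csv", b"3,4")]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
source.seek(0)

target = io.BytesIO()
targz_to_zip(source, target)
with zipfile.ZipFile(target) as zf:
    print(zf.read("b.csv"))  # only this member is decompressed
```

With real files, the same function can be called with `open("stations.tar.gz", "rb")` and `open("stations.zip", "wb")`.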

Consider using cmocean for the colormaps

https://github.com/matplotlib/cmocean
https://github.com/pangeo-data/xcmocean

docs: Visualizing maximum elevation from SCHISM

I'm using the API to visualize elevation outputs from SCHISM.
The timestamp enables the user to select a specific time for create_trimesh() function (i.e. '2011-08-05T22:00:00.000000000' or max which is the last date) and plots elevations for that specific time.
But is there a way to extract the maximum elevation for each point during the entire timeseries and plot the maximum elevations for the domain?
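
Independently of the plotting step, the per-node maximum is a reduction over the time axis. The sketch below shows the idea with NumPy on a small (time, node) array; the helper name `max_over_time` is hypothetical. With an xarray dataset the equivalent reduction would be `ds["elev"].max(dim="time")`, where the variable name `elev` and the dimension name `time` are assumptions about the dataset.

```python
import numpy as np


def max_over_time(values, time_axis=0):
    """Return the maximum along the time axis, ignoring NaNs (dry nodes)."""
    return np.nanmax(np.asarray(values, dtype=float), axis=time_axis)


# Two timestamps, three nodes; NaN marks a node without data at that time.
elev = [
    [0.1, float("nan"), 0.3],
    [0.4, 0.2, -0.1],
]
print(max_over_time(elev))
```

For outputs that do not fit in memory, the same reduction can be done stack by stack, keeping a running maximum with `np.fmax`.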

Add a way to easily create timelapses

Chrome rendering

The server deployment doesn't render on Chrome while it works on Safari & Firefox.

It is not certain if it is a security issue.

Dynamic data source selection

Right now the dataset name is hardcoded into the entrypoint of Thalassa and when the server starts, the data source cannot be modified.

Interactive dataset selection would make Thalassa much more useful. The ideal solution would have the user select a dataset/file once and then use that dataset to render plots/visualizations in every app (e.g. Elevation, Mesh, etc.).

In case this is too huge a step to take, one approach to get started is to have dataset selector within each app; for example, the user goes to Mesh app and at the top there's a "Combobox" or "File browser" to indicate what dataset to use, then after user selects a dataset/file, the content of the page refreshes with the new info.

I'm not sure how data persistence works between different apps within Panel's server. We need to explore whether there's any cookie mechanism to use or if we need to write unique data per user to the disk to persist data when moving between apps.

I am not sure if we can revert to ruamel.yaml or using the fork is important but we can elevate the issue upstream if needed.