
disp-s1's Introduction

DISP-S1


Surface Displacement workflows for OPERA DISP-S1 products.

Creates the science application software (SAS) using the dolphin library.

Development setup

Prerequisite installs

  1. Download source code:
git clone https://github.com/isce-framework/dolphin.git
git clone https://github.com/isce-framework/tophu.git
git clone https://github.com/opera-adt/disp-s1.git
  2. Install dependencies, either to a new environment:
mamba env create --name my-disp-env --file disp-s1/conda-env.yml
conda activate my-disp-env

or install within your existing env with mamba.

  3. Install tophu, dolphin, and disp-s1 via pip in editable mode:
python -m pip install --no-deps -e dolphin/ tophu/ disp-s1/

Setup for contributing

We use pre-commit to automatically run linting, formatting, and mypy type checking. Additionally, we follow numpydoc conventions for docstrings. To install pre-commit locally, run:

pre-commit install

This adds pre-commit hooks so that linting/formatting is done automatically. If code does not pass the checks, you will be prompted to fix it before committing. Remember to re-add any files you want to commit which have been altered by pre-commit. You can do this by re-running git add on the files.

Since we use black for formatting and flake8 for linting, it can be helpful to install these plugins into your editor so that code gets formatted and linted as you save.

Running the unit tests

After making functional changes and/or adding new tests, you should run pytest to check that everything is working as expected.

First, install the extra test dependencies:

python -m pip install --no-deps -e .[test]

Then run the tests:

pytest

Optional GPU setup

To enable GPU support (on aurora with CUDA 11.6 installed), install the following extra packages:

mamba install -c conda-forge "cudatoolkit=11.6" cupy "pynvml>=11.0"
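
A quick way to confirm the GPU is visible after installing (a minimal check, not part of the disp-s1 workflow):

import cupy as cp

# Count the CUDA devices visible to CuPy; this raises if the CUDA runtime is unavailable.
n_gpus = cp.cuda.runtime.getDeviceCount()
print(f"Found {n_gpus} CUDA device(s)")

# Run a trivial computation on the GPU to confirm cudatoolkit and cupy agree.
print(cp.arange(10).sum())  # -> 45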

Building the docker image

To build the docker image, run:

./docker/build-docker-image.sh --tag my-tag

which will print out instructions for running the image.


disp-s1's Issues

Add an amplitude pre-processor to convert saved mean/dispersion into one input per burst

After we move to storing means/dispersions in the CCSLCs, we need to convert them into single rasters for the current PS input format. The mean will use something like

$$ \mu_{total} = (\sum_{i=1}^k \mu_i N_i) / (\sum_{i=1}^k N_i) $$

with a similar idea for $\sigma^2_{total}$.

To minimize changes to the SAS, we can run a function on each CCSLC burst stack and produce one amplitude_mean.tif and one amplitude_dispersion.tif per burst.
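
A minimal sketch of the per-burst combination, assuming each CCSLC provides an amplitude mean raster $\mu_i$ and an image count $N_i$ (the function name and inputs here are placeholders, not the actual disp-s1 interface):

import numpy as np

def combine_amplitude_means(means: list[np.ndarray], counts: list[int]) -> np.ndarray:
    """Combine per-ministack amplitude means into one raster with an N-weighted average."""
    stacked = np.stack(means)                 # shape (k, rows, cols)
    n = np.asarray(counts, dtype=float)       # shape (k,)
    weighted_sum = (stacked * n[:, None, None]).sum(axis=0)
    return weighted_sum / n.sum()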

Run product creation in parallel

Right now, creating the ~20 NetCDF products plus the compressed SLCs takes about 20 minutes. It is largely I/O-bound, but can likely be sped up by at least 2x using a ThreadPoolExecutor.
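
A minimal sketch of the idea, assuming a per-product writer function (create_product is a placeholder for the actual disp-s1 product-creation call):

from concurrent.futures import ThreadPoolExecutor

def create_products_in_parallel(output_paths, create_product, max_workers=4):
    """Write each NetCDF product in its own thread; the I/O-bound work mostly releases the GIL."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(create_product, path) for path in output_paths]
        # .result() re-raises any exception from a worker thread.
        return [f.result() for f in futures]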

Create a data bounding polygon, add to `identification/bounding_polygon`

The CSLC product has a /identification/bounding_polygon dataset, which includes the WKT of the nodata boundary.

Possible implementation on an example GSLC:

gdal_calc.py -A NETCDF:t051_109451_iw3_20190329.h5:/data/VV --type Byte --co "COMPRESS=DEFLATE" --outfile not_nan_mask.tif --quiet --calc " ~numpy.isnan(A)"
gdal_polygonize.py -8 not_nan_mask.tif not_nan_polygons.shp 
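
An equivalent sketch in Python, assuming the subdataset can be opened through GDAL's NETCDF driver via rasterio (the convex-hull line relates to the first question below):

import numpy as np
import rasterio
from rasterio import features
from shapely.geometry import shape
from shapely.ops import unary_union

with rasterio.open("NETCDF:t051_109451_iw3_20190329.h5:/data/VV") as src:
    data = src.read(1)
    transform = src.transform

# Polygonize the valid-data mask, then merge the pieces into a single geometry.
valid = (~np.isnan(data)).astype("uint8")
geoms = [shape(g) for g, val in features.shapes(valid, transform=transform) if val == 1]
boundary = unary_union(geoms)
print(boundary.wkt)              # full nodata boundary
print(boundary.convex_hull.wkt)  # simplified version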

Other questions

  • Do we also want just the 4 corners? Take the convex hull?
  • How accurate do we care to make this?
  • Should we buffer some number of pixels?

Create a database of large (M>6) earthquakes in our AOI

At Fringe, it was mentioned that EGMS pulls in a list of earthquakes with M>6.0 from USGS. This will be relevant for checking how often such an event leads to a huge surface displacement (which depends on depth), as well as how the algorithm can handle changing reference dates.
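
A minimal sketch of pulling such a list from the USGS FDSN event web service (the bounding box values are placeholders for our AOI):

import requests

URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"
params = {
    "format": "geojson",
    "starttime": "2016-01-01",
    "minmagnitude": 6.0,
    # Placeholder AOI bounding box (min/max longitude and latitude):
    "minlongitude": -125.0,
    "maxlongitude": -114.0,
    "minlatitude": 32.0,
    "maxlatitude": 42.0,
}
events = requests.get(URL, params=params, timeout=30).json()["features"]
for event in events:
    props = event["properties"]
    # "time" is milliseconds since the epoch.
    print(props["time"], props["mag"], props["place"])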

Add "average seasonal coherence" as an input to the interface

Charlie has given a nice easy-to-use code example to pull the Josef global coherence dataset over an area of interest

https://github.com/OPERA-Cal-Val/s1-coherence-2020/blob/main/3_Seasonal_Coherence_View.ipynb

  • Figure out where to add this in the PgeRunconfig class
  • Grab the sample data for the delivery example
  • Update the description of this in the user guide

Reference date selection

  • Do a test over Alaska to determine if this tells us the correct time to pick as a reference date

Background

For each tile, seasonal composites of the coherence at different repeat intervals and backscatter imagery were calculated. We calculated the median coherence based on all coherence estimates per tile of a given repeat interval (6, 12, 18, 24, 36, and 48) per three-month period: 1) December, January, February 2) March, April, May 3) June, July, August, and 4) September, October, November. We chose the median operation to account for outliers. In the case of the backscatter intensity products, we calculated per three-month period the average backscatter intensity in VV and VH, or HH and HV, polarization.

The decay of coherence with increasing repeat interval was modelled for each season at pixel level with the exponential model [38]:

$$ \gamma(t) = (1 - \rho_\infty)\, e^{-t/\tau} + \rho_\infty \qquad (3) $$

where $\rho_\infty$ and $\tau$ denote the long-term coherence and the rate of coherence decay with increasing repeat interval, respectively.
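
A minimal sketch of fitting that model to a pixel's median coherences with scipy (the coherence values below are illustrative placeholders, not real measurements):

import numpy as np
from scipy.optimize import curve_fit

def coherence_decay(t, rho_inf, tau):
    """gamma(t) = (1 - rho_inf) * exp(-t / tau) + rho_inf"""
    return (1.0 - rho_inf) * np.exp(-t / tau) + rho_inf

# Repeat intervals in days and example median coherences for one pixel/season.
t = np.array([6, 12, 18, 24, 36, 48], dtype=float)
gamma = np.array([0.72, 0.55, 0.45, 0.40, 0.35, 0.33])  # placeholder values

(rho_inf, tau), _ = curve_fit(coherence_decay, t, gamma, p0=(0.3, 12.0), bounds=([0.0, 1e-3], [1.0, 1e3]))
print(f"rho_inf = {rho_inf:.2f}, tau = {tau:.1f} days")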

Including masks as product layers

  1. We need to store any combined masks we used for unwrapping (e.g. nodata, plus water mask, plus pixels < coherence threshold)
  2. We may consider adding another "suggested mask" so that people can easily apply one layer to the unwrapped phase to get a usable output. We don't want to have to explain how to combine 2-4 conditions to recreate the "good pixels" to look at (see the sketch after this list).
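
A minimal sketch of how the "suggested mask" could be assembled, assuming the individual layers are already boolean/float arrays (the names and the coherence threshold are placeholders):

import numpy as np

def build_suggested_mask(nodata_mask, water_mask, coherence, coherence_threshold=0.5):
    """Return True where a pixel should probably be ignored (any one condition is enough)."""
    return nodata_mask | water_mask | (coherence < coherence_threshold)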

Changes to `product.py` for updated product layout

Create configuration to run faster on a large machine

SDS has requested another config file for us/them to test on 32- or 64-CPU machines.
We'll need to adjust the parallelism of the stages. Assuming we have 128 GB of RAM, we can try:

  • Wrapped phase: n_parallel_bursts = 27
    • Running 9 parallel bursts led to ~30 GB of RAM usage.
  • Unwrapping: (multiple options for this)
    • n_parallel_tiles: 8
    • n_parallel_jobs: 5
      (TBD: exact math on how much RAM snaphu uses for different tile shapes/sizes.)

One possible idea to aid them: make a few "standard configurations" or "base configurations" (similar to the idea here, where you just specify 'big'). We could have a 16-CPU configuration, a 64-CPU one, etc.

Skip the runconfig dtype check in `validate`

@collinss-jpl noted that SDS will often make small changes to the PGE runconfig which should not affect the test. But right now the validation fails because the string length differs:

disp_s1.validate.ComparisonError: /metadata/pge_runconfig dtypes do not match: |S10634 vs |S10630

We should remove this check and just log any differences.

Choosing a reference point and recording it

  • Pick one of the PS points.
    • Possible way: take the biggest connected component, then take the lowest-amplitude-dispersion PS pixel within it (see the sketch after this list).
  • Reference the output products to this.
  • Record it in the product here: https://github.com/opera-adt/disp-s1/blob/main/src/disp_s1/product.py#L193-L209
    • Note on this format: currently the scalar "value" is the phase that is subtracted from the image. The attrs of the dataset have the rows/cols. This would just be a list of one row/one col for a single PS point.
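
A minimal sketch of the selection described above, assuming a boolean mask of the connected components and the amplitude dispersion raster are available (the inputs and function name are placeholders):

import numpy as np
from scipy import ndimage

def choose_reference_point(conncomp_mask: np.ndarray, amp_dispersion: np.ndarray) -> tuple[int, int]:
    """Lowest-amplitude-dispersion pixel inside the biggest connected component."""
    labels, n_labels = ndimage.label(conncomp_mask)
    if n_labels == 0:
        raise ValueError("no valid connected components found")
    # Largest component by pixel count (skip the background label 0).
    sizes = np.bincount(labels.ravel())[1:]
    biggest = int(np.argmax(sizes)) + 1
    # Ignore everything outside that component, then take the minimum-dispersion pixel.
    candidates = np.where(labels == biggest, amp_dispersion, np.inf)
    row, col = np.unravel_index(np.argmin(candidates), candidates.shape)
    return int(row), int(col)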

Compute solid earth tide correction, add layer

Replace "0"s made in the unwrapping output with the original wrapped phase value

Since we don't want to preclude a better unwrapping job later, we'd rather keep the original wrapped phase where the unwrapper gave up (see the sketch after this list).

  • Make sure we reset all data originally in the interferogram (both masked for unwrapping, and badly unwrapped areas)
  • Ensure that the edges haven't drifted slightly off of the nodata value. Every frame should be NaNs around the outside, and something nonzero inside.
  • Make sure we have some clear mask layer which says "we think you should probably ignore these areas"
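
A minimal sketch of the reset step, assuming the unwrapper writes 0 where it gave up, nodata stays NaN, and both rasters hold real-valued phase in radians (array names are placeholders):

import numpy as np

def restore_wrapped_phase(unwrapped: np.ndarray, wrapped: np.ndarray) -> np.ndarray:
    """Where unwrapping produced 0 (masked or failed), fall back to the original wrapped phase."""
    out = unwrapped.copy()
    failed = (unwrapped == 0) & ~np.isnan(wrapped)
    out[failed] = wrapped[failed]
    return out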

Adding a `time` dimension to the NetCDF product

This is up for discussion: we should possibly add a /time dimension to each product to explicitly label the displacement time. We should be able to follow the CF conventions (which do specify that the name should be "time") by using h5netcdf.

  • downsides:
    • the shape becomes (1, rows, cols), so some things will complain, e.g. if you directly plot the (1, rows, cols) image (this already happens when you use rasterio to load a 1-band image)
  • upsides:
    • concatenating multiple becomes more straightforward because the time dimension is explicitly encoded
    • no one has to write a string/name parser (i.e. you can just do xr.open_mfdataset() and it should just work; see the sketch after this list)
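
A minimal sketch of writing a product with an explicit time dimension using xarray + h5netcdf (the variable name, date, and file name are placeholders, not the actual DISP-S1 layout):

import numpy as np
import pandas as pd
import xarray as xr

rows, cols = 512, 512
displacement = np.zeros((1, rows, cols), dtype="float32")  # placeholder data

ds = xr.Dataset(
    {"displacement": (("time", "y", "x"), displacement)},
    coords={"time": pd.to_datetime(["2022-07-15"])},
)
ds["time"].attrs["standard_name"] = "time"  # CF-style name for the coordinate
ds.to_netcdf("disp_example.nc", engine="h5netcdf")

# Multiple products then concatenate along time without any filename parsing:
# combined = xr.open_mfdataset("disp_*.nc", combine="by_coords")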

Create spatial baseline cubes from the CSLC input orbits

New validation checks

On top of the existing golden output validation script:

  • A check that the actual bounds of the file match the expected bounds from the frame_to_burst JSON
  • Ensure the unwrapped phase values are congruent with the wrapped phase values (see the sketch below)

(...todo: figure out what better checks to do on the results).
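
A minimal sketch of the congruence check, assuming the unwrapped and wrapped phases are available as real-valued arrays in radians (reading them from the product is omitted):

import numpy as np

def check_congruence(unwrapped: np.ndarray, wrapped: np.ndarray, atol: float = 1e-3) -> None:
    """The unwrapped phase should differ from the wrapped phase by an integer number of 2*pi cycles."""
    cycles = (unwrapped - wrapped) / (2 * np.pi)
    residual = np.abs(cycles - np.round(cycles))
    valid = ~np.isnan(residual)
    n_bad = int(np.count_nonzero(residual[valid] >= atol))
    if n_bad:
        raise ValueError(f"{n_bad} pixels are not congruent with the wrapped phase")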
