echoregions's People

Contributors

ctuguinay, dependabot[bot], leewujung, ngkavin, pre-commit-ci[bot], valentina-s

echoregions's Issues

Update Shapely version to >=2.0.0 or unpin it

The shapely dependency (which is a dependency of regionmask) is pinned to 1.8.2, since version 2.0.0 causes an error (ValueError: Inconsistent coordinate dimensionality). regionmask==0.9.0 seems to require >1.7.0. Going forward, the code should be adjusted to work with the later version, and backward compatibility should possibly be sorted out.

Multiple Line Plotting demonstration notebook.

Read several .evl files and combine the bottoms to plot on one plot. There are some winter observations in each .evl file, so there may be some redundancy, and it may require more work to get the start and end dates of the transect.

Review design `Region2D` and other related modules

In reviewing #45 I had some comments on the current structure of Region2D and related modules.

That PR was merged so that we can move forward on the project, so this issue is a reminder that we should revisit these.

The comments are reproduced below:

  • review the use case of Region2D to have clarity on whether we should just parse EVR file at init, or change the input to accept both EVR and CSV/JSON that were converted previously.
  • convert_points:
    • currently lives in evr_parser.py but I think it should just live in Region2D
    • it is currently unused. From your comment above it seems that some more work is needed to smooth out the points.
  • get_points_from_region: this function does not currently work since there is no get_points_from_region under Regions2DPlotter
  • Region2DPlotter: when should it be initialized?
  • add docstring for Regions2DMasker.mask (from Region2D.mask)

Region Masking Functionality

The mask function of the Regions2D object is not working currently.

The actual functionality is in the mask method of Regions2DMasker.

A working example for making "Hake" region masks (only one region) is in this notebook.

Some considerations:

  • input could be one or many regions (based on region id's)
  • user wants all regions for a given label to be in the same mask
  • user wants regions with different labels to be layers in the mask dataset
  • user wants regions with different labels to be in one layer (for example they want to combine all fish regions into one layer)
  • provide the user with some options for how to store the pixel values of the masked regions
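
The label-related considerations above can be sketched as a grouping step that runs before any mask is built. This is only an illustration: `group_regions_by_layer`, its `region_labels` input (region id mapped to label) and the `combine` option are hypothetical names, not part of the echoregions API.

```python
from collections import defaultdict


def group_regions_by_layer(region_labels, combine=None):
    """Group region ids into mask layers.

    region_labels: dict mapping region id -> label (assumed input shape).
    combine: optional dict mapping label -> layer name, used to merge
             several labels into a single layer.
    """
    combine = combine or {}
    layers = defaultdict(list)
    for region_id, label in sorted(region_labels.items()):
        # A label without an entry in `combine` keeps its own layer.
        layers[combine.get(label, label)].append(region_id)
    return dict(layers)


labels = {1: "Hake", 2: "Krill", 3: "Hake", 4: "Myctophid"}

# One layer per label.
per_label = group_regions_by_layer(labels)

# All fish labels merged into one layer.
merged = group_regions_by_layer(
    labels, combine={"Hake": "Fish", "Myctophid": "Fish"}
)
```

Each resulting list of region ids could then be passed to the masking step, so that one call covers both the "layer per label" and the "combined layer" use cases.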

A few extra details:

  • adding regionmask version as a requirement
  • adding __init__.py
  • add examples in notebooks
  • add examples selecting different regions based on labels

Make output of `Regions2D.select_sonar_file` a list.

Currently, if there is just one file, the output is the name of that file as a string. A caller expecting a list can loop through the files, but if the output is a single string, the loop iterates over the characters of the string by mistake. It would be simpler to always return a list. Maybe rename to select_sonar_files?
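
A minimal sketch of the pitfall and of one way to normalize the output. The helper name `as_file_list` and the file names are made up for illustration; this is not the current implementation of `select_sonar_file`.

```python
from pathlib import Path


def as_file_list(files):
    """Return a list of file names, whether given one name or many."""
    if isinstance(files, (str, Path)):
        return [str(files)]
    return [str(f) for f in files]


# The pitfall: looping over a bare string yields its characters.
chars = [f for f in "a.raw"]

# After normalization the caller can always loop over files.
single = as_file_list("a.raw")
several = as_file_list(["a.raw", "b.raw"])
```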

Finding files based on metadata of the sonar files.

Currently, finding the sonar files in echoregions relies on the data in the file names. This is fine for our hake survey use case, but eventually we should make it work with the metadata from the sonar file (opened without specifying any group), so that it is independent of the file name.
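
One possible shape for metadata-based selection is sketched below, assuming the first and last ping times of each sonar file have already been read from its metadata. The function name `files_overlapping`, the `file_ranges` dict and the file names are hypothetical.

```python
from datetime import datetime


def files_overlapping(file_ranges, region_start, region_end):
    """Return the files whose time span overlaps the region's time span.

    file_ranges: dict mapping file name -> (first_ping_time, last_ping_time),
                 assumed to come from the sonar file metadata.
    """
    return sorted(
        name
        for name, (start, end) in file_ranges.items()
        if start <= region_end and end >= region_start
    )


file_ranges = {
    "a.nc": (datetime(2019, 7, 2, 0, 0), datetime(2019, 7, 2, 0, 30)),
    "b.nc": (datetime(2019, 7, 2, 0, 30), datetime(2019, 7, 2, 1, 0)),
    "c.nc": (datetime(2019, 7, 2, 1, 0), datetime(2019, 7, 2, 1, 30)),
}
selected = files_overlapping(
    file_ranges,
    datetime(2019, 7, 2, 0, 20),
    datetime(2019, 7, 2, 0, 40),
)
```

The result is always a list, which also sidesteps the single-string problem when only one file matches.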

Interpolation for Line Plot

A possible feature could be using interpolation to connect some of these dots. More specifically, the interpolation algorithm would have to generate associated timestamp and depth values.
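
As a rough sketch of what such an interpolation could produce, the function below fills in (timestamp, depth) pairs at a regular time step using plain linear interpolation. The name `interpolate_line` and the fixed step are assumptions for illustration, and the input times are assumed to be strictly increasing.

```python
from datetime import datetime, timedelta


def interpolate_line(times, depths, step_seconds):
    """Linearly interpolate line points onto a regular time step.

    Returns a list of (timestamp, depth) pairs from the first to the
    last input time. Assumes `times` is strictly increasing.
    """
    if len(times) != len(depths) or len(times) < 2:
        raise ValueError("need at least two (time, depth) points")
    step = timedelta(seconds=step_seconds)
    out = []
    t = times[0]
    i = 0  # index of the segment [times[i], times[i + 1]] that contains t
    while t <= times[-1]:
        while i < len(times) - 2 and t > times[i + 1]:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
        out.append((t, depths[i] + frac * (depths[i + 1] - depths[i])))
        t += step
    return out


start = datetime(2019, 7, 2, 0, 0, 0)
times = [start, start + timedelta(seconds=10), start + timedelta(seconds=30)]
depths = [100.0, 110.0, 90.0]
points = interpolate_line(times, depths, step_seconds=10)
```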

Reorganize tests

Right now the different tests are sprinkled across 3 .py files in the tests folder without clear groupings.
Let's reorganize them (and add new tests as needed) based on the functionalities being tested, as we move forward to add test data into the repo (#25).

Fix import redundancy

After the major cleanup of unused imports in #24, there is still some redundancy in the import statements that needs to be fixed.

Below I list the two I saw while going through the import statement issues:
https://github.com/leewujung/echoregions/blob/6734119e55ee1f6fab997a57e8467d2b3dfe1822/echoregions/__init__.py#L1-L5
Under echoregions.convert the CalibrationParser and read_* functions are also imported.
Do we want the users to invoke them at the root level, or as part of the convert subpackage?

https://github.com/leewujung/echoregions/blob/6734119e55ee1f6fab997a57e8467d2b3dfe1822/echoregions/tests/test_ev_parser.py#L5-L6
parse_time is separately imported here, but it is already imported at the root level in echoregions/__init__.py above.
Again, do we want the users to invoke them at the root level?

Add echoregions to conda

It'll be good to add echoregions to conda, and have that triggered by GitHub releases. This will improve our overall workflow and ML work.

Use `pathlib` instead of `os`

Some parts of the codebase use pathlib but other parts use os for path handling. Some other parts use pure strings. Let's clean up everything to use pathlib.
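
A small illustration of the kind of change this implies. The helper `evr_to_csv_path` and the paths are made up for the example; the equivalent os.path version is shown in a comment for comparison.

```python
from pathlib import Path


def evr_to_csv_path(evr_path, out_dir):
    """Build the output CSV path for an EVR file (hypothetical helper).

    os.path equivalent:
        base = os.path.splitext(os.path.basename(evr_path))[0]
        os.path.join(out_dir, base + ".csv")
    """
    evr_path = Path(evr_path)  # accepts both str and Path inputs
    return Path(out_dir) / evr_path.with_suffix(".csv").name


csv_path = evr_to_csv_path("surveys/2019/x1.evr", "out")
```

Converting inputs with `Path(...)` at the function boundary lets callers keep passing plain strings while the internals use a single path type.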

Warnings that arose from PR #81

Text:

=============================== warnings summary ===============================
echoregions/tests/test_r2d.py::test_mask_no_overlap
:241: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject

echoregions/tests/test_r2d.py::test_mask_no_overlap
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)

echoregions/tests/test_r2d.py::test_mask_no_overlap
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('mpl_toolkits').
Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)

Line masking problems

Making bottom masks from .evl files requires interpolating the bottom curve points onto the grid of the sonar file. The choice of interpolation scheme affects the result. We should provide an option to select different interpolation schemes and let the user change the parameters, so that this works for different datasets.

A version for the Hake Survey is in this notebook.
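
To illustrate how the scheme changes the result, here is a dependency-free sketch that evaluates a bottom line at each ping time with a selectable method. The function name `bottom_at_pings` and the three method names are assumptions for this example, not an existing echoregions option; line times are assumed sorted, and pings outside the line's time span get None.

```python
from bisect import bisect_right


def bottom_at_pings(line_times, line_depths, ping_times, method="linear"):
    """Evaluate a bottom line at each ping time.

    method: "linear", "nearest" or "previous" (illustrative choices).
    """
    if method not in ("linear", "nearest", "previous"):
        raise ValueError(f"unknown method: {method}")
    result = []
    for t in ping_times:
        if t < line_times[0] or t > line_times[-1]:
            result.append(None)  # no extrapolation outside the line
            continue
        i = bisect_right(line_times, t) - 1  # last line point at or before t
        if method == "previous" or i == len(line_times) - 1:
            result.append(line_depths[i])
            continue
        t0, t1 = line_times[i], line_times[i + 1]
        d0, d1 = line_depths[i], line_depths[i + 1]
        if method == "nearest":
            result.append(d0 if (t - t0) <= (t1 - t) else d1)
        else:
            result.append(d0 + (t - t0) / (t1 - t0) * (d1 - d0))
    return result


# Times are plain seconds here; datetime objects work the same way.
line_times = [0, 10, 20]
line_depths = [100.0, 120.0, 110.0]
pings = [-5, 0, 5, 15, 20, 25]
linear = bottom_at_pings(line_times, line_depths, pings)
previous = bottom_at_pings(line_times, line_depths, pings, method="previous")
```

With these inputs the ping at 5 s gets 110.0 under "linear" but 100.0 under "previous", which is the kind of difference a user-selectable scheme would expose.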

Add Line Masking Functionality

Right now there is no line mask in the echoregions modules. We need line masking in order to get precise depth values that improve the Hake ML biomass calculations. To resolve this issue, a function line.mask(Sv, interp options) must be created that includes the following:

  • filter time start/end based on Sv
  • basic interpolation that always occurs, using pandas DataFrame interpolation options
  • make mask

This implementation will repurpose the code in the notebook created by Valentina Staneva, Hake Bottom Interpolation.
A sufficient implementation of this function will resolve this issue as well as #43 and #82.
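
The first and last steps in the list above can be sketched without any dependencies, assuming a bottom depth per ping is already available from the interpolation step. The names `clip_to_time_range` and `bottom_mask` are hypothetical; a real implementation would more likely operate on the Sv xarray dataset directly.

```python
def clip_to_time_range(points, start, end):
    """Keep only the (time, depth) line points inside [start, end]."""
    return [(t, d) for t, d in points if start <= t <= end]


def bottom_mask(depth_bins, bottom_depths):
    """Build mask[ping][depth_bin]: 1 at or below the bottom, else 0.

    bottom_depths holds one value per ping; None means the bottom is
    unknown for that ping, and the whole ping is left unmasked.
    """
    return [
        [1 if (b is not None and d >= b) else 0 for d in depth_bins]
        for b in bottom_depths
    ]


# Line points outside the Sv time range (here 10..30) are dropped.
points = [(0, 95.0), (10, 100.0), (20, 120.0), (40, 130.0)]
clipped = clip_to_time_range(points, start=10, end=30)

# One row per ping, one column per depth bin.
mask = bottom_mask(depth_bins=[0, 10, 20, 30], bottom_depths=[15, None, 30])
```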

Echoregions functionality notebook

A notebook describing the different attributes and methods of the objects in echoregions (regions & lines).

  • UML diagram
  • regions functionality
  • lines functionality

Geared toward technical people who would want to modify it/extend it.

Store different labels to be layers in the mask dataset

One way to achieve that is through a 3D array mask, or through stacking them as variables. One can also generate them separately and stack them. Regionmask has the option mask_3D, which is binary. One can also generate them by looping through the region types.
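
The "loop through the region types" approach can be sketched with plain nested lists, assuming a 2D mask whose cells hold a region id (or None where no region applies). The function name `split_mask_by_label` and the inputs are illustrative only; this does not use regionmask.

```python
def split_mask_by_label(id_mask, region_labels):
    """Turn a 2D region-id mask into one binary 2D layer per label.

    id_mask: 2D list of region ids, None where no region applies.
    region_labels: dict mapping region id -> label.
    """
    layers = {}
    for label in sorted(set(region_labels.values())):
        layers[label] = [
            [1 if region_labels.get(cell) == label else 0 for cell in row]
            for row in id_mask
        ]
    return layers


id_mask = [[None, 1], [2, 3]]
region_labels = {1: "Hake", 2: "Hake", 3: "Krill"}
layers = split_mask_by_label(id_mask, region_labels)
```

Stacking the per-label layers along a new dimension gives the 3D form, while keeping them in a dict corresponds to storing them as separate variables.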

Multiple Region plotting use case.

  • read through a folder of .evr files.
  • For each region in the list of regions across files
    - find the corresponding sonar data files
    - plot the region superimposed on the background of the sonar .sv data
    - save a .png

Note that the .png files will be of different sizes, since the regions cover different time spans.

Create first release for echoregions

It'll be good in the not too distant future to create a release for echoregions that contains stable basic functionalities. We can set up the releases to publish directly to PyPI and trigger a conda build (#67).

IMHO we should do this after resolving things in #54 so things are not that confusing.

Tasks

  • Set up webhook for Zenodo
  • Do slight refactoring so as to match what is found in Scientific Python and Python Packaging
  • Create release on GitHub
  • Submit to test PyPI
  • Submit to official PyPI
  • Ensure the package can be pip-installed directly from PyPI
  • Update pip install instructions in docs to use PyPI

Determine Earliest Working Regionmask Version

  • determine earliest regionmask working version
  • Masking function does not work with 0.7.0 (one gets unix_time instead of ping_time; need to add tests to catch this error). 0.9.0 works.

Add functionality to create masks for within-transect (good regions)

This can take several forms but it slightly depends on how people have annotated individual breakpoints:

  • start transect
  • break transect
  • resume transect
  • end transect

Usually those should be annotated with log lines, but sometimes they can be thin boxes or within-transect boxes. One approach is to create an intermediate table with timestamps for the starting and ending times and create the mask based on it.
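
A minimal sketch of the intermediate-table approach, assuming the annotations have already been reduced to time-sorted (time, kind) events. The function names and the four kind strings are hypothetical and simply mirror the breakpoints listed above.

```python
def transect_intervals(events):
    """Build (start_time, end_time) intervals from sorted transect events.

    events: list of (time, kind), kind being one of
            "start", "break", "resume", "end".
    """
    intervals = []
    open_time = None
    for t, kind in events:
        if kind in ("start", "resume"):
            if open_time is not None:
                raise ValueError(f"{kind} at {t} while already on transect")
            open_time = t
        elif kind in ("break", "end"):
            if open_time is None:
                raise ValueError(f"{kind} at {t} while off transect")
            intervals.append((open_time, t))
            open_time = None
        else:
            raise ValueError(f"unknown event kind: {kind}")
    if open_time is not None:
        raise ValueError("transect was never closed")
    return intervals


def within_transect(ping_times, intervals):
    """Return one boolean per ping: True if it falls inside any interval."""
    return [any(s <= t <= e for s, e in intervals) for t in ping_times]


# Times are plain numbers here; datetime objects work the same way.
events = [(0, "start"), (10, "break"), (20, "resume"), (30, "end")]
intervals = transect_intervals(events)
ping_mask = within_transect([5, 15, 25, 35], intervals)
```

Raising on inconsistent sequences (for example two "start" events in a row) makes annotation mistakes visible early instead of silently producing a wrong mask.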
