osoceanacoustics / echoregions
Interfacing water column sonar data with annotations and labels
Home Page: https://echoregions.readthedocs.io/
License: Apache License 2.0
The shapely dependency (which is a dependency of regionmask) is pinned to 1.8.2, since version 2.0.0 causes an issue (ValueError: Inconsistent coordinate dimensionality). regionmask==0.9.0 seems to require >1.7.0. Going forward, the code should be adjusted to work with the later version, and backward compatibility possibly sorted out.
Read several .evl files and combine the bottoms to plot on one plot. There are some winter observations in each .evl file, so there may be some redundancy, and it may require more work to get the start and end dates of the transect.
In reviewing #45 I had some comments on the current structure of Region2D and related modules.
That PR was merged so that we can move forward on the project, so this issue is a reminder that we should revisit these.
The comments are reproduced below:
Region2D to have clarity on whether we should just parse EVR file at init, or change the input to accept both EVR and CSV/JSON that were converted previously.
convert_points: evr_parser.py but I think it should just live in Region2D
get_points_from_region: this function does not currently work since there is no get_points_from_region under Regions2DPlotter
Region2DPlotter: when should it be initialized?
Regions2DMasker.mask (from Region2D.mask
The mask function of the Regions2D object is not working currently.
The actual functionality is in the mask method of the Regions2DMasker.
Working example for making "Hake" region masks (only one region) in this notebook.
Some considerations:
A few extra details:
Currently, if there is just one file, the output is the name of that file. If one expects a list, one can loop through the files, but if the output is only a string, one would loop over the characters of the string by mistake. It will be simpler to always expect a list. Maybe change to select_sonar_files?
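A minimal sketch of the "always a list" behavior discussed above. The helper name as_file_list and the example file name are hypothetical and not part of echoregions; the point is only that a bare string is wrapped before anyone iterates over it.

```python
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def as_file_list(files: Union[PathLike, Iterable[PathLike]]) -> List[str]:
    """Return the input as a list of file names, even for a single file.

    Hypothetical helper for illustration; not an echoregions function.
    """
    # A bare string (or Path) is one file, not an iterable of characters.
    if isinstance(files, (str, Path)):
        return [str(files)]
    return [str(f) for f in files]


# Looping is now safe whether one file or many were selected.
for name in as_file_list("transect_01.raw"):
    print(name)
```

With this in place the caller never needs to check the return type before looping.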
Some existing tests are failing at pandas >=2.0. Let's make sure that requirements.txt is updated and the test issues are fixed so that all tests run.
Currently, finding the sonar files in echoregions relies on the date in the file names. This is fine for our hake survey use case, but eventually we should make it work with the metadata from the sonar file (opened without specifying any group), so that it is independent of the file name.
Test and update the Lines_plotting.ipynb notebook.
Related to #59.
The Regions2D_functions.ipynb notebook is likely outdated. Need to test what does or does not work, and update the notebook.
Right now the different tests are sprinkled across 3 .py files in the tests folder without clear groupings.
Let's reorganize them (and add new tests as needed) based on the functionalities being tested, as we move forward to add test data into the repo (#25).
Read 1 region from 1 .evr file. Find the overlapping sonar files, and plot the region superimposed on them.
After the major cleanup of unused imports in #24, there is still some redundancy in the import statements that needs to be fixed.
Below I list the two I saw while going through the import statement issues:
https://github.com/leewujung/echoregions/blob/6734119e55ee1f6fab997a57e8467d2b3dfe1822/echoregions/__init__.py#L1-L5
Under echoregions.convert the CalibrationParser and read_* functions are also imported.
Do we want the users to invoke them at the root level, or as part of the convert subpackage?
https://github.com/leewujung/echoregions/blob/6734119e55ee1f6fab997a57e8467d2b3dfe1822/echoregions/tests/test_ev_parser.py#L5-L6
parse_time is separately imported here but it is actually imported at the root level in echoregions/__init__.py above already.
Again do we want the users to invoke them at the root level?
During testing, xarray==0.16.2 raises a nanosecond encoding error: pydata/xarray#4400
It seems to be resolved in xarray==2023.2.0. Need to identify a minimum cutoff version and set it in requirements-dev.txt
#57 converts only masking functionality to new sv format, not line functionality. Update that with new files and example notebooks.
Currently the test data used in the tests are not in the repo. They should be added.
Add functionality and examples to read directly from the cloud.
It'll be good to add echoregions to conda, and have that triggered by GitHub releases. This will improve our overall workflow and ML work.
Some parts of the codebase use pathlib but other parts use os for path handling. Some other parts use pure strings. Let's clean up everything to use pathlib.
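As a rough illustration of what the cleanup could look like, here is a sketch of a path helper written only with pathlib. The function name csv_path_for and the file names are made up for this example; nothing here is taken from the echoregions code.

```python
from pathlib import Path


def csv_path_for(evr_file, out_dir):
    """Build the output .csv path for an .evr file; accepts str or Path.

    Hypothetical helper for illustration; not an echoregions function.
    """
    # Convert once at the boundary so the rest of the code only sees Path objects.
    evr_file = Path(evr_file)
    out_dir = Path(out_dir)
    # pathlib equivalent of joining out_dir with the base name and swapping the extension.
    return out_dir / evr_file.with_suffix(".csv").name


print(csv_path_for("data/x1.evr", "out"))
```

Converting inputs to Path at the function boundary keeps string and os.path handling from leaking into the rest of the module.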
Warnings that arose from PR #81
Text:
=============================== warnings summary ===============================
echoregions/tests/test_r2d.py::test_mask_no_overlap
:241: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 16 from C header, got 96 from PyObject
echoregions/tests/test_r2d.py::test_mask_no_overlap
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
echoregions/tests/test_r2d.py::test_mask_no_overlap
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('mpl_toolkits').
Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
evr_parser is using df.append to append rows. Append to a list and concat instead.
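For reference, a sketch of the list-then-concat pattern. DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so this change also matters for the pandas >=2.0 test failures. The function name combine_regions, the column names, and the input layout are assumptions for illustration and do not reflect the actual evr_parser data structures.

```python
import pandas as pd


def combine_regions(regions):
    """Combine per-region point lists into one DataFrame.

    `regions` maps a region id to a list of (time, depth) tuples.
    Hypothetical layout for illustration; not the evr_parser format.
    """
    frames = []
    for region_id, points in regions.items():
        # Build one small frame per region and collect it in a plain list.
        frames.append(
            pd.DataFrame(
                {
                    "region_id": region_id,
                    "time": [p[0] for p in points],
                    "depth": [p[1] for p in points],
                }
            )
        )
    # A single concat at the end avoids copying the growing frame on every row.
    return pd.concat(frames, ignore_index=True)


df = combine_regions({1: [(0.0, 10.0), (1.0, 12.0)], 2: [(5.0, 30.0)]})
print(df)
```

Note that pd.concat raises a ValueError on an empty list, so an empty input would need its own handling.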
Making bottom masks from .evl files requires interpolating the bottom curve points to the grid of the sonar file. The choice of the interpolation scheme affects the result. We should provide an option to select different interpolation schemes, and let the user change the parameters so this works for different datasets.
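To make the effect of the interpolation scheme concrete, here is a small sketch with a selectable method, using NumPy only. The function name interpolate_bottom, the method names, and the use of float seconds for time are assumptions for this example; it is not the interpolation used in the Hake notebook. It assumes the line times are sorted and contain at least two points.

```python
import numpy as np


def interpolate_bottom(line_time, line_depth, ping_time, method="linear"):
    """Interpolate bottom-line depths onto the ping times of the sonar data.

    Hypothetical helper for illustration. Times are plain floats (e.g. seconds)
    and `line_time` must be sorted in increasing order.
    """
    line_time = np.asarray(line_time, dtype=float)
    line_depth = np.asarray(line_depth, dtype=float)
    ping_time = np.asarray(ping_time, dtype=float)

    if method == "linear":
        return np.interp(ping_time, line_time, line_depth)
    if method == "nearest":
        # Neighbouring line points on each side of every ping.
        right = np.clip(np.searchsorted(line_time, ping_time), 1, len(line_time) - 1)
        left = right - 1
        # Ties are resolved in favour of the earlier (left) point.
        use_right = (line_time[right] - ping_time) < (ping_time - line_time[left])
        return np.where(use_right, line_depth[right], line_depth[left])
    if method == "previous":
        # Hold the last line point at or before each ping.
        idx = np.searchsorted(line_time, ping_time, side="right") - 1
        return line_depth[np.clip(idx, 0, len(line_time) - 1)]
    raise ValueError(f"unknown interpolation method: {method!r}")


pings = [0, 5, 18, 20]
for m in ("linear", "nearest", "previous"):
    print(m, interpolate_bottom([0, 10, 20], [100, 200, 150], pings, method=m))
```

The three methods give different depths for the same ping (for example at time 18), which is the kind of difference a user would want to control per dataset.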
Version for Hake Survey is in this notebook
A notebook which demonstrates how to read one .evl file, plot the line on top of sonar data, and export the line to .csv format.
As of right now, there is no Line Mask in the echoregions modules. We need line masking in order to get precise depth values to improve Hake ML biomass calculations. To resolve this issue, a function line.mask(Sv, interp options) must be created that should include the following:
This implementation will repurpose the following code in the notebook created by Valentina Staneva as Hake Bottom Interpolation.
Sufficient implementation of this function will resolve this issue and those of #43, #82.
Make sure this notebook is up to date with the newer Sv format.
https://github.com/OSOceanAcoustics/echoregions/blob/main/notebooks/Lines_plotting.ipynb
A notebook describing the different attributes and methods of the objects in echoregions (regions & lines).
Geared toward technical people who would want to modify it/extend it.
Regions2D_masking.ipynb is built on top of Regions2D_plotting.ipynb, so test and update Regions2D_plotting.ipynb first.
One way to achieve that is through a 3D array mask, or through stacking them as variables. One can also generate them separately and stack them. Regionmask has the option mask_3D, which is binary. One can also generate them by looping through the region types.
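A sketch of the "generate separately and stack" option, assuming the goal is one binary mask per region type. It uses plain NumPy rather than regionmask, and the function name stack_masks and the labels are hypothetical.

```python
import numpy as np


def stack_masks(masks_by_label):
    """Stack per-label 2D boolean masks into one 3D array.

    Returns the sorted labels and an array of shape (label, rows, cols).
    Hypothetical helper for illustration; all masks must share one shape.
    """
    labels = sorted(masks_by_label)
    stacked = np.stack(
        [np.asarray(masks_by_label[label], dtype=bool) for label in labels],
        axis=0,
    )
    return labels, stacked


hake = [[1, 0, 0], [1, 1, 0]]
other = [[0, 0, 1], [0, 0, 1]]
labels, mask_3d = stack_masks({"Hake": hake, "Other": other})
print(labels, mask_3d.shape)
```

Keeping the label axis separate leaves overlapping regions representable, which a single integer-coded 2D mask would not.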
.evr files
.sv data
.png
Note the .png's will be of different size since the regions are of different time spans.
It'll be good in the not too distant future to create a release for echoregions that contains stable basic functionalities. We can set up the releases to publish directly to PyPI and trigger a conda build (#67).
IMHO we should do this after resolving things in #54 so things are not that confusing.
regionmask working version
unix_time instead of ping_time; need to add tests to catch this error). 0.9.0 works.
This can take several forms but it slightly depends on how people have annotated individual breakpoints:
Usually those should be annotated with log lines, but sometimes they can be thin boxes or within-transect boxes. One approach is to create an intermediate table with time stamps for the starting and ending times and create the mask based on it.
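The intermediate-table approach could be sketched as follows, using only the standard library. The table is represented here as a list of (start, end) datetime pairs; that layout, the function name transect_mask, and the example times are assumptions for illustration.

```python
from datetime import datetime


def transect_mask(ping_times, intervals):
    """Return one boolean per ping: True if it falls inside any interval.

    `intervals` is the intermediate table, given as (start, end) datetime
    pairs with inclusive bounds. Hypothetical helper for illustration.
    """
    return [
        any(start <= t <= end for start, end in intervals)
        for t in ping_times
    ]


intervals = [
    (datetime(2017, 7, 1, 8, 0), datetime(2017, 7, 1, 9, 0)),
    (datetime(2017, 7, 1, 10, 0), datetime(2017, 7, 1, 11, 0)),
]
pings = [
    datetime(2017, 7, 1, 8, 30),   # inside the first transect
    datetime(2017, 7, 1, 9, 30),   # between transects
    datetime(2017, 7, 1, 10, 0),   # on the start of the second transect
]
print(transect_mask(pings, intervals))
```

However the breakpoints were annotated (log lines, thin boxes, or within-transect boxes), they only need to be reduced to start and end times before this step.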
This file seems no longer used? Can we remove it as we move forward to tidy up the repo?
Might get too many issues, but it will be easier to find #TODO