pymccrgb's Issues

Update classification pipeline

  • Increase max_iter for convergence
  • Reduce n_components for better memory usage
  • Evaluate parameters on test dataset
  • Refactor pipeline argument handling
  • Improve memory usage by predicting on ground points only
  • Parallelize prediction with joblib
  • Evaluate tiling data or using dask-ml to parallelize prediction
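The chunked, parallel prediction idea in the list above could look roughly like the sketch below. It uses the standard library's ThreadPoolExecutor as a stand-in for joblib so that it runs without extra dependencies. The names predict_in_chunks and toy_predict are hypothetical, and the "classifier" is a toy rule rather than the project's model.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def predict_in_chunks(predict, rows, chunk_size=1000, workers=4):
    """Apply `predict` to each chunk in a thread pool and concatenate the labels.

    `predict` is any callable that maps a list of rows to a list of labels
    (hypothetical stand-in for a fitted classifier's predict method).
    """
    chunks = list(chunked(rows, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the input chunks.
        results = pool.map(predict, chunks)
        return [label for chunk_labels in results for label in chunk_labels]


def toy_predict(chunk):
    """Toy rule for illustration only: label a point 1 if green exceeds red."""
    return [1 if g > r else 0 for r, g, b in chunk]


labels = predict_in_chunks(toy_predict, [(10, 200, 10), (200, 10, 10)], chunk_size=1)
```

Processing in chunks bounds the memory needed per call. Whether threads or processes give a real speedup depends on whether the underlying predict call releases the GIL.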

Add CLI

Add scripts:

  • mccrgb.py
  • mcc.py

which take the following arguments

Name                        Description
-i input                    Input filename
-o output                   Output filename
-s, --scales                List of scale domains
-t, --tols                  List of tolerances
-trs, --training_scales     List of MCC-RGB update scale domains
-trt, --training_tols       List of MCC-RGB update tolerances
--max_iter                  Maximum number of iterations
--downsample                If true, downsample point cloud before spline interpolation
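
A minimal sketch of how the arguments above might be declared with argparse. The flag names come from the table. The types, the required/optional choices, and the build_parser name are assumptions, not the project's implementation.

```python
import argparse


def build_parser():
    """Build a parser for the options listed in the issue (sketch only)."""
    parser = argparse.ArgumentParser(
        description="Classify ground points with MCC-RGB (sketch)."
    )
    parser.add_argument("-i", dest="input", required=True, help="Input filename")
    parser.add_argument("-o", dest="output", required=True, help="Output filename")
    # Assumption: scales and tolerances are passed as space-separated floats.
    parser.add_argument("-s", "--scales", type=float, nargs="+",
                        help="List of scale domains")
    parser.add_argument("-t", "--tols", type=float, nargs="+",
                        help="List of tolerances")
    parser.add_argument("-trs", "--training_scales", type=float, nargs="+",
                        help="List of MCC-RGB update scale domains")
    parser.add_argument("-trt", "--training_tols", type=float, nargs="+",
                        help="List of MCC-RGB update tolerances")
    parser.add_argument("--max_iter", type=int,
                        help="Maximum number of iterations")
    parser.add_argument("--downsample", action="store_true",
                        help="Downsample point cloud before spline interpolation")
    return parser


# Example invocation with hypothetical filenames and values.
args = build_parser().parse_args(
    ["-i", "in.laz", "-o", "out.laz",
     "-s", "0.5", "1.0", "-t", "0.3", "0.3",
     "-trs", "0.5", "--downsample"]
)
```

Both mccrgb.py and mcc.py could share this parser, with mcc.py simply ignoring or omitting the training options.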

JOSS review: Fix linter failures

A final side note: I believe that a linter such as flake8 should be used in order to prevent bugs and conform to Python standards. From the root of the repository (commit d97450d), the call to flake8 returned the following (note that line ./pymccrgb/pointutils.py:28 of crop_to_polygon would raise a NameError):

./setup.py:11:80: E501 line too long (92 > 79 characters)
./setup.py:38:80: E501 line too long (80 > 79 characters)
./docs/source/conf.py:58:1: E302 expected 2 blank lines, found 1
./docs/source/conf.py:86:1: E402 module level import not at top of file
./docs/source/conf.py:87:1: E402 module level import not at top of file
./docs/source/conf.py:156:1: E402 module level import not at top of file
./pymccrgb/core.py:90:1: W293 blank line contains whitespace
./pymccrgb/core.py:94:1: W293 blank line contains whitespace
./pymccrgb/core.py:103:1: W293 blank line contains whitespace
./pymccrgb/core.py:205:72: W291 trailing whitespace
./pymccrgb/core.py:211:22: W291 trailing whitespace
./pymccrgb/core.py:222:1: W293 blank line contains whitespace
./pymccrgb/core.py:318:80: E501 line too long (87 > 79 characters)
./pymccrgb/core.py:339:80: E501 line too long (83 > 79 characters)
./pymccrgb/__init__.py:1:1: F401 '.core' imported but unused
./pymccrgb/__init__.py:1:1: F401 '.datasets' imported but unused
./pymccrgb/__init__.py:1:1: F401 '.features' imported but unused
./pymccrgb/__init__.py:1:1: F401 '.ioutils' imported but unused
./pymccrgb/__init__.py:1:1: F401 '.pointutils' imported but unused
./pymccrgb/__init__.py:1:1: F401 '.plotting' imported but unused
./pymccrgb/__init__.py:2:1: F401 '.core.mcc' imported but unused
./pymccrgb/__init__.py:2:1: F401 '.core.mcc_rgb' imported but unused
./pymccrgb/__init__.py:3:1: F401 '.ioutils.read_data' imported but unused
./pymccrgb/__init__.py:3:31: W291 trailing whitespace
./pymccrgb/ioutils.py:10:15: E225 missing whitespace around operator
./pymccrgb/ioutils.py:12:1: E302 expected 2 blank lines, found 1
./pymccrgb/ioutils.py:39:80: E501 line too long (80 > 79 characters)
./pymccrgb/ioutils.py:43:80: E501 line too long (80 > 79 characters)
./pymccrgb/ioutils.py:46:80: E501 line too long (100 > 79 characters)
./pymccrgb/ioutils.py:51:80: E501 line too long (81 > 79 characters)
./pymccrgb/ioutils.py:63:22: W291 trailing whitespace
./pymccrgb/ioutils.py:129:5: F841 local variable 'count' is assigned to but never used
./pymccrgb/ioutils.py:159:80: E501 line too long (117 > 79 characters)
./pymccrgb/ioutils.py:160:9: E131 continuation line unaligned for hanging indent
./pymccrgb/ioutils.py:170:5: F841 local variable 'count' is assigned to but never used
./pymccrgb/ioutils.py:183:80: E501 line too long (88 > 79 characters)
./pymccrgb/ioutils.py:184:9: E131 continuation line unaligned for hanging indent
./pymccrgb/ioutils.py:192:5: F841 local variable 'count' is assigned to but never used
./pymccrgb/pointutils.py:5:1: F401 'pdal' imported but unused
./pymccrgb/pointutils.py:28:27: F821 undefined name 'poly'
./pymccrgb/pointutils.py:78:1: W293 blank line contains whitespace
./pymccrgb/pointutils.py:90:80: E501 line too long (111 > 79 characters)
./pymccrgb/pointutils.py:113:80: E501 line too long (104 > 79 characters)
./pymccrgb/pointutils.py:149:80: E501 line too long (99 > 79 characters)
./pymccrgb/classification.py:36:22: W291 trailing whitespace
./pymccrgb/classification.py:50:80: E501 line too long (88 > 79 characters)
./pymccrgb/colorize.py:3:1: F401 'pdal' imported but unused
./pymccrgb/plotting.py:10:1: F401 'mpl_toolkits.mplot3d.Axes3D' imported but unused
./pymccrgb/plotting.py:75:80: E501 line too long (83 > 79 characters)
./pymccrgb/plotting.py:110:80: E501 line too long (80 > 79 characters)
./pymccrgb/plotting.py:189:80: E501 line too long (83 > 79 characters)
./pymccrgb/plotting.py:196:80: E501 line too long (81 > 79 characters)
./pymccrgb/plotting.py:202:12: F821 undefined name 'updated'
./pymccrgb/plotting.py:205:5: F841 local variable 'ymax' is assigned to but never used
./pymccrgb/plotting.py:207:80: E501 line too long (81 > 79 characters)
./pymccrgb/datasets.py:46:80: E501 line too long (92 > 79 characters)
./pymccrgb/tests/test_features.py:5:1: F401 'pytest' imported but unused
./pymccrgb/tests/test_features.py:29:80: E501 line too long (80 > 79 characters)
./pymccrgb/tests/test_features.py:44:80: E501 line too long (83 > 79 characters)
./pymccrgb/tests/test_features.py:51:80: E501 line too long (81 > 79 characters)
./pymccrgb/tests/context.py:4:80: E501 line too long (85 > 79 characters)
./pymccrgb/tests/context.py:6:1: E402 module level import not at top of file
./pymccrgb/tests/context.py:6:1: F401 'pymccrgb' imported but unused
./pymccrgb/tests/test_core.py:5:1: F401 'pytest' imported but unused
./pymccrgb/tests/test_core.py:29:80: E501 line too long (83 > 79 characters)
./pymccrgb/tests/test_core.py:35:80: E501 line too long (99 > 79 characters)
./pymccrgb/tests/test_core.py:94:80: E501 line too long (87 > 79 characters)
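
To illustrate why the F821 entries matter more than the style warnings, here is a small, hypothetical function (not the actual code in pointutils.py) with the same kind of bug: the body refers to a name that was never defined. Python accepts the definition and only fails when the line runs.

```python
def crop_to_polygon_sketch(points, polygon):
    """Hypothetical example of an F821-style bug, not the project's function."""
    # Bug: the parameter is called `polygon`, but the body uses `poly`.
    return [p for p in points if poly.contains(p)]


# The function definition above succeeds; the error appears only at call time.
try:
    crop_to_polygon_sketch([(0.0, 0.0)], polygon=None)
    message = None
except NameError as err:
    message = str(err)
```

A linter catches this statically, before any test or user reaches the broken code path.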

openjournals/joss-reviews#1777

Add datasets submodule

  • Add function - load text files
  • Add function - load LAZ files
  • Add HSL dataset - lidar
  • Add HSL dataset - sfm
  • Add KP dataset - colorized lidar

Add tests

  • Features tests
  • MCC test on sample lidar
  • MCC-RGB tests:
    • One veg. scale
    • Two veg. scales

lidar and sfm data?

Refactor IO to use laspy

Right now, write_pdal and write_las use temporary files to save a numpy array as a LAS file. This is a problem because:

  • The temp file may be huge, and take a long time to write
  • This doesn't keep any spatial reference information

Instead, write_las should use laspy's File object to write points and copy a header. This means that read_las should also use laspy and return a header to copy.

Reorganize submodules for user importing of API

This should work:

from pymccrgb import load_data, mcc, mcc_rgb
X = load_data('mydataset.txt')
y_mcc = mcc(X)
y_mccrgb = mcc_rgb(X)
  • Consider API design - do we want MCCClassifier and MCCRGBClassifier objects?
    • This makes less sense for MCC because it's purely iterative. Could work for MCCRGB
  • Implement API functions or wrappers where necessary
  • Change top-level imports in __init__.py

JOSS review: Download and cache datasets

Finally, I wonder whether it is essential to include 21.6 MB of example data, which accounts for practically the entire size of the Python wheel or source tarball. I believe that it would be better to keep such example data in a separate repository (e.g., with the example notebooks there instead of in docs/source/examples) or host it on an S3 instance so that it is downloaded (and maybe cached locally) once users call the load_ methods of the datasets module.
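
A minimal sketch of the download-and-cache behaviour suggested above, using only the standard library. The function name fetch_cached and the cache layout are assumptions. A real implementation would likely also verify checksums and handle interrupted downloads.

```python
import urllib.request
from pathlib import Path


def fetch_cached(url, cache_dir, filename=None):
    """Return a local path for `url`, downloading it only if not cached yet.

    `cache_dir` and `filename` are hypothetical parameters for this sketch.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Default to the last path component of the URL as the local filename.
    target = cache_dir / (filename or url.rsplit("/", 1)[-1])
    if not target.exists():
        with urllib.request.urlopen(url) as response:
            target.write_bytes(response.read())
    return target
```

A load_ function could then call fetch_cached with the dataset URL and read the returned path, so the data is transferred once per machine rather than shipped inside the wheel.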

openjournals/joss-reviews#1777

Fix IO bugs due to input assumptions

There are a couple of IO bugs due to poor input handling.

  • usecols does not behave correctly when reading a LAS/LAZ file because it performs selection on a labeled array
  • write_las fails silently (writes empty file) when passed an XYZ (rather than XYZRGB) array

Some of these could be resolved by using laspy (#15)
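
One way to turn the silent write_las failure into an explicit error is to validate the array shape before writing. The sketch below assumes the writer expects at least six columns (X, Y, Z, R, G, B). The helper name check_xyzrgb is hypothetical.

```python
import numpy as np


def check_xyzrgb(points):
    """Return `points` as a 2-D array, or raise if it lacks color columns."""
    points = np.asarray(points)
    if points.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {points.ndim} dimension(s)")
    if points.shape[1] < 6:
        raise ValueError(
            "expected at least 6 columns (X, Y, Z, R, G, B), "
            f"got {points.shape[1]}"
        )
    return points


# A 3 x 6 array passes validation unchanged.
valid = check_xyzrgb(np.zeros((3, 6)))
```

An alternative design would accept XYZ arrays and write a file without color, but raising early at least makes the problem visible to the caller.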

Support user-defined training ranges

  • Accept training_scales and training_tols lists
  • Insert into MCC scales and tols lists if appropriate
  • Reclassify at these intervals in mcc_rgb
  • Fix updated index list to track multiple update steps
  • Add tests
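
The merging step in the list above might be sketched as follows. This assumes that a training step is inserted only when its (scale, tolerance) pair is not already present, and that steps are processed in order of increasing scale. Both are assumptions about "if appropriate", and merge_training_steps is a hypothetical name.

```python
def merge_training_steps(scales, tols, training_scales, training_tols):
    """Merge training steps into the MCC schedule.

    Returns the merged (scale, tol) steps and a parallel list of flags that
    mark the steps at which a reclassification (update) would happen.
    """
    steps = list(zip(scales, tols))
    training = list(zip(training_scales, training_tols))
    for pair in training:
        if pair not in steps:
            steps.append(pair)
    # Stable sort by scale keeps the original order of equal-scale steps.
    steps.sort(key=lambda step: step[0])
    is_update = [step in training for step in steps]
    return steps, is_update


steps, is_update = merge_training_steps(
    [0.5, 1.0, 1.5], [0.3, 0.3, 0.3],
    training_scales=[0.5, 2.0], training_tols=[0.3, 0.1],
)
```

The flag list gives mcc_rgb one place to look up whether the current step is an update step, which also helps when tracking updated indices across several updates.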

JOSS review: Improve installation

Installation: while I could successfully install the package by following the instructions in the documentation, I believe that the procedure is a bit too convoluted: it requires cloning the source repository just for the environment file and then suggests installing the package from PyPI when, given that the repository has already been cloned, the package could be installed with pip install . from the repository root. I would strongly suggest that the authors create a conda recipe so that the package can be installed from conda-forge without the need for cloning the source repository.

openjournals/joss-reviews#1777

Update calls to pymcc

The pymcc interface changed:

  • Remove extra arguments from calls to mcc functions
  • Test changes
  • Cut new release of pymcc on pypi

Package code and Cython dependencies

  • Write environment and setup files
  • Sort out installation and compilation of pymcc Cython wrapper via git
  • Test on Linux
  • Test on machine without system install of gcc, boost, liblas
  • Test install on Travis

Update test data hosting

Data used by the example load_... functions was originally stored in a personal AWS bucket. Usage increased and I was billed more than expected in January.

  • Move test data to academic hosting
  • Update base URL in datasets.py

Write interface for tiled processing

It would be good to process a large point cloud as tiles with the same scale domain/height tolerance.

It might be best to implement Pointcloud and TiledPointcloud classes for this. Major features could be

  • Load a dataset (.load()) as a dask array
  • Read data into memory by tile location
  • Run MCC/MCCRGB on tile
  • Save point labels as they are returned
  • Ideally this could be accomplished with dask workers via mcc or mcc_rgb.
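
As a starting point for the tiling step, the sketch below groups point indices by square tile in plain Python, without dask. The function name tile_indices and the choice of square, origin-aligned tiles are assumptions.

```python
import math
from collections import defaultdict


def tile_indices(points, tile_size):
    """Map each (column, row) tile key to the indices of the points inside it.

    `points` is any sequence of (x, y, ...) records. Tiles are squares of
    side `tile_size` aligned to the coordinate origin.
    """
    tiles = defaultdict(list)
    for index, (x, y, *_rest) in enumerate(points):
        key = (math.floor(x / tile_size), math.floor(y / tile_size))
        tiles[key].append(index)
    return dict(tiles)


example = tile_indices([(0.5, 0.5, 1.0), (1.5, 0.2, 2.0), (0.9, 0.9, 3.0)], 1.0)
```

Each tile's index list could then be passed to mcc or mcc_rgb independently. In practice, tiles would probably need a buffer of neighbouring points so that surfaces interpolated near tile edges stay consistent.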

Add colorize utility

  • Implement a utility function to project color values from an image or orthomosaic onto a point cloud (wraps PDAL's filters.colorization)
  • Add example with NAIP imagery
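
A colorize utility built on PDAL would mostly need to assemble a pipeline definition. The sketch below only builds the pipeline JSON around filters.colorization and does not run PDAL. The filenames and the function name colorization_pipeline are hypothetical.

```python
import json


def colorization_pipeline(points_in, raster, points_out):
    """Build a PDAL pipeline definition that colors points from a raster."""
    return {
        "pipeline": [
            points_in,  # reader is inferred by PDAL from the file extension
            {"type": "filters.colorization", "raster": raster},
            points_out,  # writer is inferred from the file extension
        ]
    }


# Hypothetical filenames, for illustration only.
pipeline_json = json.dumps(
    colorization_pipeline("points.laz", "naip.tif", "colorized.laz"), indent=2
)
```

The resulting JSON string could be saved to a file and run with the pdal pipeline command, or passed to the PDAL Python bindings.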
