digitalslidearchive / histomicstk

A Python toolkit for pathology image analysis algorithms.

Home Page: https://digitalslidearchive.github.io/HistomicsTK/

License: Apache License 2.0

Python 96.79% CMake 0.20% Shell 0.14% Dockerfile 0.45% Cython 2.42%

bioimage-informatics computer-vision digital-slide-archive histology machine-learning medical-image-processing python

histomicstk's Issues

Setup CMake based testing and add some documentation

@brianhelba You can add the documentation for testing here. I will find a good place to move it to when we organize our documentation later.

Check matching input sizes in EmbedBounds

EmbedBounds does not throw error when mask/label and intensity images are not same size.

Poisson mixture model cuts

Implement Poisson mixture modeling and binary pygco graph cuts for foreground/background segmentation.

Slicer XML based command-line argument parser

Create a slicer xml based command-line argument parser using argparse to enable developers to write slicer execution model CLIs in python.

Once this is done, developers will be able to write a CLI in Python once and then both run it on the command line (the in-line Python script CLIs that we have now won't allow this) and call it over the web via a REST endpoint.
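
A minimal sketch of the idea, assuming a heavily simplified Slicer-style XML description: each typed parameter element becomes one argparse argument. The XML content, the `build_parser` helper, and the `TYPE_MAP` table are hypothetical names for illustration, not the parser that was eventually written. Positional ordering by `<index>` and most parameter types are ignored here.

```python
import argparse
import xml.etree.ElementTree as ET

# Hypothetical, simplified CLI description in the Slicer execution model style.
EXAMPLE_XML = """
<executable>
  <title>Example</title>
  <description>Toy CLI description</description>
  <parameters>
    <integer>
      <name>radius</name>
      <longflag>radius</longflag>
      <description>Kernel radius</description>
      <default>3</default>
    </integer>
    <double>
      <name>sigma</name>
      <longflag>sigma</longflag>
      <description>Smoothing scale</description>
      <default>1.5</default>
    </double>
    <string>
      <name>inputImage</name>
      <index>0</index>
      <description>Input image path</description>
    </string>
  </parameters>
</executable>
"""

# Only a few scalar types are handled in this sketch.
TYPE_MAP = {"integer": int, "float": float, "double": float, "string": str}


def build_parser(xml_text):
    """Build an argparse parser from the XML description."""
    root = ET.fromstring(xml_text)
    parser = argparse.ArgumentParser(description=root.findtext("description"))
    for group in root.iter("parameters"):
        for param in group:
            if param.tag not in TYPE_MAP:
                continue  # skip labels, descriptions, unsupported types
            convert = TYPE_MAP[param.tag]
            name = param.findtext("name")
            kwargs = {"type": convert, "help": param.findtext("description")}
            longflag = param.findtext("longflag")
            if longflag is not None:
                # Parameters with a long flag become optional arguments.
                default = param.findtext("default")
                if default is not None:
                    kwargs["default"] = convert(default)
                parser.add_argument("--" + longflag.lstrip("-"), dest=name, **kwargs)
            else:
                # Parameters without a flag become positional arguments.
                parser.add_argument(name, **kwargs)
    return parser


args = build_parser(EXAMPLE_XML).parse_args(["slide.svs", "--radius", "5"])
print(args.inputImage, args.radius, args.sigma)
```

The same XML file could then drive both the command-line parser and the REST endpoint description, which is the point of the proposal.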

Use comma separated values for vector parameters in a CLI

@jbeezley
In the front end, use a single text box for each vector parameter (integer-vector, float-vector, double-vector, string-vector) in a CLI and expect the user to provide a comma-separated list of values.
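
On the receiving side, such a text box value could be converted roughly as follows. `parse_vector` is a hypothetical helper name, and treating empty items as absent is an assumption of this sketch.

```python
def parse_vector(text, element_type=float):
    """Split a comma-separated string and convert each non-empty item."""
    items = [item.strip() for item in text.split(",")]
    return [element_type(item) for item in items if item]


print(parse_vector("1, 2,3", int))
print(parse_vector("0.5,1.5"))
print(parse_vector("a, b", str))
```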

Remove dependency on matplotlib

I don't think we need to require matplotlib for installing histomicstk. This is causing issues installing on our cluster as it tries to pull in all kinds of other packages. People are free to install this independently and there are other options for viewing images.

Switch from romanesco to girder_worker

@brianhelba .travis.yml needs to be updated to switch from romanesco to girder_worker

3rd party nuclear segmentation with pygco

Implement an end-to-end nuclear segmentation pipeline using the following components:
- Whole-slide Reinhard normalization
- Poisson mixture models / binary graph cuts
- Constrained Laplacian of Gaussian (cLoG) splitting to produce a label image
- Multilabel graph cuts to refine the label image

Address memory limits in Sample.py

Sample attempts to load a low-resolution version of the entire slide to generate a tissue mask. If the WSI does not provide a 1.25X magnification, this could result in loading the entire slide at 5X or higher magnification.

Modify Auto CLI REST endpoint code to run standalone CLIs using Docker in girder_worker

Initial feature set - implementation

Installer is trying to download matplotlib each time I run setup.py

Create a stable branch for HistomicsTK and host it on girder.neuro.emory.edu

The stable branch should contain the most recent stable version of HistomicsTK that we can show to people.

Implement CLI for segment nuclei and generate annotations for them

Fill values distort model fitting procedures

The fill values used when requesting image tiles from OpenSlide can produce errors in building models to distinguish tissue from glass. Need to modify code to exclude pixels from filled areas that are added to create an integral-sized tiling.

Pass values of CLI parameters in json.dumps format while running a CLI through front-end

While running a CLI through the HistomicsTK front end (localhost:808/histomicstk#<cli-name>), pass the values of CLI parameters in the POST request in the format produced by json.dumps. I use json.loads to convert them back to values of their Python types.
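
The round trip described here can be illustrated with a few made-up parameters (the names and values below are hypothetical): each value is encoded with json.dumps before being sent and decoded with json.loads on the server.

```python
import json

# Hypothetical CLI parameter values of different Python types.
params = {"radius": 5, "sigma": 1.5, "useMask": True, "channels": [0, 2]}

# What the front end would send: every value as a JSON string.
encoded = {name: json.dumps(value) for name, value in params.items()}

# What the server recovers: values with their original types.
decoded = {name: json.loads(text) for name, text in encoded.items()}

print(encoded)
print(decoded == params)
```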

Install requirements_c.txt also in .travis.yml

@brianhelba should we also be installing requirements_c.txt in .travis.yml?

I had to move all packages that depend on C libraries into a separate file for ReadTheDocs to work. The issue is described here. I had to add mocks for them in docs/conf.py and exclude them from install_requires in setup.py to get the documentation to build without failure on ReadTheDocs. I don't know if there is a better way.
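
For context, a common way to mock such packages in a Sphinx conf.py looks roughly like the sketch below. The module names are placeholders rather than the actual contents of requirements_c.txt, and this is not necessarily what docs/conf.py in this repository does. Sphinx's autodoc extension also offers the `autodoc_mock_imports` setting for the same purpose.

```python
import sys
from unittest.mock import MagicMock

# Placeholder names standing in for packages that need C libraries.
MOCK_MODULES = ["hypothetical_c_ext", "hypothetical_c_ext.filters"]

# Register a mock for each name so that "import hypothetical_c_ext"
# succeeds during the documentation build without the real package.
sys.modules.update((name, MagicMock()) for name in MOCK_MODULES)

import hypothetical_c_ext  # resolves to the mock registered above

print(type(hypothetical_c_ext).__name__)
```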

Develop front-end for DSA

The front end should provide ways to:

- select data from girder
- visualize whole-slide images and annotations/results in a viewer
- select and run analysis CLIs exposed in the server directory of HistomicsTK

Installing ITK with python as a package for ubuntu

This is ITK v4.7 and I just verified on Monday that this works.

https://github.com/zachmullen/girder_training/blob/master/ansible/vagrant_playbook.yml#L17-L23

Then you can apt-get install insighttoolkit4-python

Upgrade to GeoJS 0.9

This contains zoom animations, which are an important feature.

Implement AdaptiveColorNorm

Currently this is empty.

Fix the generation of API documentation on ReadTheDocs

Clarify parameter names in EmbedBounds

Cycle in MaxClustering

There is a cycle in trajectory tracking for MaxClustering that causes an endless loop. Need to check the edge values of tracking matrices for correctness.

Fix seeding of elongated cells in GaussianVoting.py

Elongated cells are not being detected properly by GaussianVoting. Their symmetry is more axial than radial, and the votes are too diffuse to be detected as seed locations.

Fix optimization routine in SimpleMask.py

Optimization routine 'fmin_slsqp' from scipy is stopping early before convergence. Need to examine stopping criteria in documentation and do some more extensive testing for constrained curve fitting of image histograms.

Load images via large_image API

Easy setup of DSA stack using vagrant

@brianhelba can we assign this to someone? This will allow everyone to use the same development environment.

Alpha expansions w/ graph coloring - implementation

Refinement of label images using multi-label pygco graph cuts (alpha-expansions or alpha-beta swap). Graph coloring is used to reduce the computational burden so that # objects < # cells.
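
To illustrate the graph coloring step only (not the graph cut itself), here is a minimal greedy coloring sketch. It assumes the adjacency between neighboring cells is already known; the adjacency data and the `greedy_coloring` name are hypothetical. Cells that share a color never touch, so they can share one label during the multi-label cut.

```python
def greedy_coloring(adjacency):
    """Assign each node the smallest color not used by its neighbors."""
    colors = {}
    # Visit nodes with many neighbors first; a common greedy heuristic.
    order = sorted(adjacency, key=lambda node: len(adjacency[node]), reverse=True)
    for node in order:
        used = {colors[nb] for nb in adjacency[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


# Hypothetical adjacency between five segmented cells.
adjacency = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
colors = greedy_coloring(adjacency)
print(colors)
print("labels needed:", len(set(colors.values())), "for", len(adjacency), "cells")
```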

SparseColorDeconvolution output intensities are inverted

Refactor histomicstk into submodules and correct naming

Tasks include moving functions into submodules, converting function names from snake-case to camel-case, correcting function names in "See Also" docstring sections, and adding underscore prefixes to non-exposed functions.

Fix handling negative findings in MinimumModel contour seeding

Optimize calls to scipy.ndimage.measurements.find_objects

Avoid unnecessary computation by calling with masks for select objects.

Optimize memory in label image functions

Examine in-place versus copy operations.

Bump girder, girder_worker and large_image in HistomicsTK/.travis.yml

This issue is about bumping the commit hashes of girder, girder_worker and large_image to their latest commits in HistomicsTK/.travis.yml.

Convert function outputs to named tuples
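
As a sketch of what this could look like, the function below returns a named tuple instead of a plain tuple. The `ChannelStats` type, its fields, and the `channel_stats` function are hypothetical and not taken from the library.

```python
from collections import namedtuple

# Hypothetical output type; field names are illustrative only.
ChannelStats = namedtuple("ChannelStats", ["mean", "std"])


def channel_stats(values):
    """Return the mean and population standard deviation of a sequence."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return ChannelStats(mean=mean, std=variance ** 0.5)


result = channel_stats([1.0, 2.0, 3.0])
print(result.mean, result.std)  # access by name
mean, std = result              # positional unpacking still works
```

Existing callers that unpack the result positionally keep working, while new code can refer to outputs by name.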

MaxClustering - fix seeding

Deal with cases where there are multiple connected seed pixels in (Response == Max) - typically encountered where Response is flat.

Create an API for getting tiles of whole-slide images

A good way to do this is to create an abstract WholeSlideImage class that defines the API and then derive classes from it, such as GirderWholeSlideImage (which serves an image stored on girder) and LocalWholeSlideImage (which serves an image located on the local machine).

We would want all analysis functions in histomicstk to take objects of the abstract WholeSlideImage class as input instead of a numpy array or PIL image.
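
A minimal sketch of this design, assuming hypothetical method names (`get_dimensions`, `get_tile`) and a toy in-memory backend standing in for the Girder and local implementations:

```python
from abc import ABC, abstractmethod


class WholeSlideImage(ABC):
    """Abstract tile source; the method names are illustrative only."""

    @abstractmethod
    def get_dimensions(self):
        """Return (width, height) of the image in pixels."""

    @abstractmethod
    def get_tile(self, x, y, width, height):
        """Return the pixels of the requested region as rows."""


class InMemoryWholeSlideImage(WholeSlideImage):
    """Toy backend over a nested list, standing in for a real backend."""

    def __init__(self, rows):
        self._rows = rows

    def get_dimensions(self):
        return len(self._rows[0]), len(self._rows)

    def get_tile(self, x, y, width, height):
        return [row[x:x + width] for row in self._rows[y:y + height]]


def mean_intensity(slide, tile_size=2):
    """Analysis code that depends only on the abstract interface."""
    width, height = slide.get_dimensions()
    total = count = 0
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            for row in slide.get_tile(x, y, tile_size, tile_size):
                total += sum(row)
                count += len(row)
    return total / count


slide = InMemoryWholeSlideImage([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(mean_intensity(slide))
```

Because `mean_intensity` never touches the storage details, swapping in a Girder-backed or file-backed subclass would not require changing the analysis code.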

Write code to automatically create REST routes for slicer execution model CLIs

Fix issue with ConvertSchedule.py and 40X slides

ConvertSchedule is generating bad size conversions for some slides, causing Sample.py to throw errors.

Max clustering - implementation

Implement a CLI that inputs a whole-slide image, segments nuclei, and generates features

EmbedBounds shape check incorrectly checks image planes

Shape check should only confirm that inputs have same number of rows/columns.
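
The intended check could look like the sketch below, which compares only the first two dimensions. It operates on shape tuples (such as the `.shape` of an array), and the function name is hypothetical, not the actual EmbedBounds code.

```python
def check_same_rows_cols(image_shape, mask_shape):
    """Raise ValueError unless both shapes agree in rows and columns."""
    if tuple(image_shape[:2]) != tuple(mask_shape[:2]):
        raise ValueError(
            "size mismatch: image is %s, mask is %s"
            % (tuple(image_shape[:2]), tuple(mask_shape[:2]))
        )


# An RGB image and a single-plane mask with matching rows/columns pass.
check_same_rows_cols((512, 512, 3), (512, 512))

# A mask with a different number of rows is rejected.
try:
    check_same_rows_cols((512, 512, 3), (256, 512))
except ValueError as err:
    print(err)
```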

Sample.py produces empty output

On images:
TCGA-HW-8322-01Z-00-DX1.B6F32F6B-7FA3-42E4-98CB-F2F55208F7C4.svs
TCGA-HT-7479-01Z-00-DX1.E310974D-E52C-4634-8E16-2A41E8C37D45.svs
TCGA-DU-8165-01Z-00-DX1.8d633ff1-6aed-41e5-8518-8856dbe4a718.svs
TCGA-DU-8166-01Z-00-DX1.82951397-8c63-4c0f-8696-f081b170e21f.svs

File "/opt/lib/python2.7/site-packages/histomicstk-0.1.0-py2.7.egg/histomicstk/Sample.py", line 119, in Sample
Pixels = np.concatenate(Pixels, 1)

Remove condensing from label processing functions

Functions for processing label images automatically condense the label image values to fill gaps for simplicity. This breaks the correspondence between the values of objects before and after calling the function. This can be fixed by skipping over 'empty' labels in the input.
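
A small sketch of the proposed behavior, using a plain nested list as the label image and a hypothetical `label_areas` function: results are keyed by the original label values, and values that do not occur in the image are simply skipped rather than renumbered.

```python
from collections import Counter


def label_areas(label_image):
    """Count pixels per label, keeping the original label values.

    Zero is treated as background. Labels absent from the image
    produce no entry, so no renumbering is needed.
    """
    counts = Counter(v for row in label_image for v in row if v != 0)
    return dict(sorted(counts.items()))


# Labels 1, 3 and 4 are 'empty' here; objects 2 and 5 keep their values.
labels = [
    [0, 2, 2],
    [5, 5, 0],
    [5, 0, 0],
]
print(label_areas(labels))
```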

RAGLayer error when number of objects <= 2

Optimization fails in SimpleMask.py

On image TCGA-S9-A6UA-01Z-00-DX1.99AB8786-858B-46A0-94BA-C5AA1CD5351B.svs

File "/opt/lib/python2.7/site-packages/histomicstk-0.1.0-py2.7.egg/histomicstk/SimpleMask.py", line 55, in SimpleMask
TissuePeak = Peaks[yHist[Peaks[1:]].argmax()+1] #take highest peak among remaining peaks as background
ValueError: attempt to get argmax of an empty sequence
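
The traceback indicates that `Peaks[1:]` was empty, that is, at most one histogram peak was found. One possible guard is sketched below with plain lists; the function name is hypothetical, and what the caller should do when no second peak exists is left open, since the issue does not say.

```python
def highest_remaining_peak(peaks, heights):
    """Return the tallest peak after the first one, or None if there is none.

    peaks   -- histogram bin indices of detected peaks
    heights -- histogram counts indexed by bin
    """
    remaining = peaks[1:]
    if not remaining:
        return None  # fewer than two peaks; the caller must handle this case
    return max(remaining, key=lambda p: heights[p])


heights = [0, 9, 1, 4, 0, 7, 2]
print(highest_remaining_peak([1, 3, 5], heights))
print(highest_remaining_peak([1], heights))
```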

Rendering of annotations in HistomicsTK front-end

This issue is about adding the functionality necessary for rendering annotations on top of images in the HistomicsTK front-end.

@manthey and @jbeezley, can one of you please take this up?

You can use some dummy annotations of each type to build the functionality necessary to render them.

Add types missing in girder_worker to support translation from slicer CLIs

Boundary and feature formats

Define the formats to be produced by object segmentation and feature extraction.

Boundaries - will need a function to convert a list of 2 x N arrays (x,y) to some format for consumption into Girder.

Features - how to link the feature names to the rows of a K x N array. Would these be consumed into Girder or kept as arrays on disk?