marleysudbury / detect-diagnose-tumours
Final year project. Analysis and diagnosis of histology images of tumours.
Home Page: https://marleysudbury.github.io/final-year-project
To evaluate the model's learning, annotations should be generated that can be compared with the hand annotations provided by a pathologist.
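One way to make that comparison concrete, assuming both annotations can be rendered as binary masks of the same shape, is an overlap score such as the Dice coefficient. This is a sketch, not code from the project:

```python
import numpy as np

def dice_score(pred_mask: np.ndarray, truth_mask: np.ndarray) -> float:
    """Dice similarity between a generated annotation mask and a
    pathologist's hand annotation (boolean arrays of equal shape).
    1.0 means perfect agreement, 0.0 means no overlap."""
    pred = pred_mask.astype(bool)
    truth = truth_mask.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / total
```

Dice weights the intersection against the sizes of both masks, so it stays meaningful even when the annotated region is a small fraction of the slide.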
The code is currently a mess with a lot of redundancy. To make everything clearer, I will be refactoring it to be object-oriented.
This error occurs many times while attempting to normalise patches. Perhaps it is caused by patches that are completely blank? In any case, many patches are being missed because of this error.
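If blank patches are indeed the cause, one option is to guard the normalisation step with a tissue check and skip near-uniform background patches. This is a sketch built on the assumption above; the threshold value is a guess and `normalise_stain` is a hypothetical stand-in for the project's normaliser:

```python
import numpy as np

def has_tissue(patch: np.ndarray, std_threshold: float = 5.0) -> bool:
    """Heuristic check that an RGB patch (H, W, 3, uint8) contains tissue.
    Blank background patches are nearly uniform, so their pixel standard
    deviation is close to zero. The threshold would need tuning on real
    slides."""
    return float(patch.std()) > std_threshold

# Hypothetical use in the patch loop, skipping instead of crashing:
# for patch in patches:
#     if not has_tissue(patch):
#         continue
#     normalised = normalise_stain(patch)
```

This also suggests a way to confirm the hypothesis: log the patches that fail normalisation and check whether they would all have been rejected by the guard.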
As a pre-processing step in the image pipeline, the stains on WSIs (whole slide images) should be normalised so that staining variation does not affect the training and classification processes.
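The simplest form of this idea is to match each patch's colour statistics to a reference patch. Proper stain normalisation would use a Reinhard-style transfer in Lab colour space or Macenko stain-vector separation; the sketch below is only a minimal channel-wise mean/std match in RGB to illustrate the mechanics, not the project's actual method:

```python
import numpy as np

def match_stain_statistics(patch: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Crude stain normalisation: shift each RGB channel of `patch` so its
    mean and standard deviation match those of a reference `target` patch.
    A real pipeline would work in Lab space (Reinhard) or separate the
    H&E stain vectors (Macenko)."""
    patch_f = patch.astype(np.float64)
    target_f = target.astype(np.float64)
    out = np.empty_like(patch_f)
    for c in range(3):
        p_mean, p_std = patch_f[..., c].mean(), patch_f[..., c].std()
        t_mean, t_std = target_f[..., c].mean(), target_f[..., c].std()
        scale = t_std / p_std if p_std > 0 else 1.0
        out[..., c] = (patch_f[..., c] - p_mean) * scale + t_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```

Note that this interacts with the blank-patch problem: a uniform patch has zero standard deviation, which is exactly the kind of degenerate input a normaliser needs to handle explicitly.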
The training data (H&N5000) has been cropped to include only the important areas; I believe this was done automatically by the microscope when scanning the slides. The evaluation dataset (Cam16) is not cropped at all. This means the model receives far less relevant detail from the evaluation dataset, making it less effective at diagnosis.
This could be resolved by cropping the images based on the available annotation data, or by switching to a patch-based approach, where this would not matter.
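The cropping route could be as simple as taking the bounding box of the annotation mask, which would roughly reproduce the scanner's automatic cropping on the Cam16 slides. A sketch, assuming the annotation is available as a binary mask aligned with the image:

```python
import numpy as np

def crop_to_annotation(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Crop `image` (H, W, C) to the bounding box of the annotated region
    in `mask` (H, W, bool). Assumes the mask contains at least one True."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return image[r0:r1 + 1, c0:c1 + 1]
```

At full WSI resolution this would be done on a downsampled level of the pyramid and the box scaled back up, rather than on a full-resolution array.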
The two modules cannot be imported at the same time. This is a problem now that I am using OpenSlide for the patch-based approach. I will work around it for the moment by fudging the code, but perhaps Pyvips can be replaced entirely with OpenSlide, which would have the benefit of cutting down on dependencies.
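A slightly cleaner version of the fudge, assuming the conflict is between the two libraries' native binaries: defer each import into the function that uses it, so a process that only touches one backend never loads the other's shared libraries. Function names here are illustrative:

```python
def read_region_openslide(slide_path, location, level, size):
    """Read a slide region via OpenSlide. The import is deferred so that
    a process which only uses Pyvips never loads OpenSlide's native
    libraries, and vice versa."""
    import openslide  # deferred: loaded only on first call
    slide = openslide.OpenSlide(slide_path)
    try:
        return slide.read_region(location, level, size)
    finally:
        slide.close()

def load_image_pyvips(image_path):
    """Load an image via Pyvips, with the same deferred-import trick."""
    import pyvips  # deferred
    return pyvips.Image.new_from_file(image_path)
```

This keeps both code paths in the repository while guaranteeing that any single run of the trainer or classifier loads at most one of the two native libraries.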
Currently, all experiments have used a simple model provided in the TensorFlow documentation, referred to for now as the 'first model'. Other models to try:
These should be separated into different files that can be imported by the trainer and classify scripts. This should also reduce code duplication.
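One way to organise this is a small registry module that each model file registers itself with, so the trainer and classifier share a single lookup. This is a hypothetical layout, not the project's code; the builder here returns a placeholder dict where the real version would return the Keras model from the TensorFlow documentation:

```python
# models/registry.py (hypothetical module)
# trainer.py and classify.py would both do:
#     from models.registry import build_model
#     model = build_model("first_model", input_shape=(224, 224, 3))

MODEL_BUILDERS = {}

def register_model(name):
    """Decorator that records a model builder function under `name`."""
    def decorator(builder):
        MODEL_BUILDERS[name] = builder
        return builder
    return decorator

def build_model(name, **kwargs):
    if name not in MODEL_BUILDERS:
        raise ValueError(f"Unknown model {name!r}; known: {sorted(MODEL_BUILDERS)}")
    return MODEL_BUILDERS[name](**kwargs)

@register_model("first_model")
def build_first_model(input_shape=(224, 224, 3)):
    # Placeholder: the real builder would construct and return the simple
    # Keras model used in the experiments so far.
    return {"name": "first_model", "input_shape": input_shape}
```

Each new model then becomes one new file with one decorated builder, and neither the trainer nor the classifier needs to change.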
If the model is trained on the compressed H&N5000 dataset, it can achieve an acceptable level of accuracy. However, this accuracy does not transfer to the Cam16 dataset. Possible causes:
Using Pyvips for the image pipeline requires the Libvips binaries to be present on the machine. These binaries are available for download on Windows, but on Linux they need to be installed with a package manager, e.g.:
sudo apt-get install libvips42
The problem with this when it comes to my project is that I'm executing the code on lab PCs for which I don't have admin rights. I have tried copying across the files from my Linux machine to the lab machine, but there are so many dependencies that doing this manually is impractical and a waste of time (I have already wasted 3 or 4 days).
For now, either I will continue with a parallel workflow, or I can ask IT Support whether the package can be installed. They will probably say no.
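A third option worth checking, assuming a user-level conda (e.g., Miniconda installed into the home directory) is allowed on the lab machines: the conda-forge channel packages libvips and pyvips, so they can be installed without admin rights. Versions and names below are illustrative:

```shell
# After installing Miniconda into $HOME (no root needed):
# pyvips from conda-forge pulls in the libvips binaries as a dependency.
conda create -n histology -c conda-forge python=3.10 pyvips
conda activate histology
python -c "import pyvips"   # sanity check that the binding loads
```

Whether this is permitted depends on the lab machines' policy, so it is worth confirming with IT Support before relying on it.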
Using the annotations on the data, it should be possible to step through each slide one small region at a time. If a region lies within an annotated area, that annotation's label is applied to the patch. If it is outside, or on the border of, an annotated area, it could be discarded or put into a neutral class.
If these patches are then exported and used to train a neural network, a new slide could be analysed by cutting it into patches and running each one through the neural network. This approach has been shown to be effective by previous research.
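The labelling rule above can be sketched as follows. For simplicity the annotations are rectangles here; real Cam16 annotations are polygons, so a full version would use polygon containment, and the OpenSlide tiling at the end is shown only as commented-out illustration:

```python
def label_patch(patch_box, annotation_boxes):
    """Label one patch by the rule above.

    `patch_box` is an (x0, y0, x1, y1) rectangle; `annotation_boxes` is a
    list of ((x0, y0, x1, y1), label) pairs. Returns the label if the
    patch lies fully inside an annotated region, and None (discard or
    neutral class) if it is outside or straddles a border.
    """
    px0, py0, px1, py1 = patch_box
    for (ax0, ay0, ax1, ay1), label in annotation_boxes:
        if ax0 <= px0 and ay0 <= py0 and px1 <= ax1 and py1 <= ay1:
            return label  # fully inside this annotation
    for (ax0, ay0, ax1, ay1), _ in annotation_boxes:
        if px0 < ax1 and ax0 < px1 and py0 < ay1 and ay0 < py1:
            return None  # straddles a border: discard / neutral class
    return None  # entirely outside every annotation

# Tiling a slide with OpenSlide might look like this (untested sketch):
# import openslide
# slide = openslide.OpenSlide("slide.tif")
# for y in range(0, slide.dimensions[1], 256):
#     for x in range(0, slide.dimensions[0], 256):
#         patch = slide.read_region((x, y), 0, (256, 256))
#         label = label_patch((x, y, x + 256, y + 256), annotation_boxes)
```

Exporting only the labelled patches then gives exactly the training set the patch-based approach needs, and the same tiling loop can be reused at classification time.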