marleysudbury / detect-diagnose-tumours
Final year project. Analysis and diagnosis of histology images of tumours.
Home Page: https://marleysudbury.github.io/final-year-project
To evaluate the model's learning, annotations should be generated that can be compared with the hand annotations provided by a pathologist.
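One way to make that comparison concrete, assuming both annotations can be rendered as binary masks of the same shape, is an overlap score such as the Dice coefficient. This is a sketch, not code from the project:

```python
import numpy as np

def dice_score(pred_mask: np.ndarray, truth_mask: np.ndarray) -> float:
    """Dice similarity between a generated annotation mask and a
    pathologist's hand annotation (boolean arrays of equal shape).
    1.0 means perfect agreement, 0.0 means no overlap."""
    pred = pred_mask.astype(bool)
    truth = truth_mask.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / total
```

Dice weights the intersection against the sizes of both masks, so it stays meaningful even when the annotated region is a small fraction of the slide.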
The code is currently a mess with a lot of redundancy. To make everything clearer, I will be refactoring it to be object-oriented.
This error occurs many times while attempting to normalise patches. Perhaps it is caused by patches that are completely blank? In any case, many patches are being missed because of this error.
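If blank patches are indeed the cause, one option is to guard the normalisation step with a tissue check and skip near-uniform background patches. This is a sketch built on the assumption above; the threshold value is a guess and `normalise_stain` is a hypothetical stand-in for the project's normaliser:

```python
import numpy as np

def has_tissue(patch: np.ndarray, std_threshold: float = 5.0) -> bool:
    """Heuristic check that an RGB patch (H, W, 3, uint8) contains tissue.
    Blank background patches are nearly uniform, so their pixel standard
    deviation is close to zero. The threshold would need tuning on real
    slides."""
    return float(patch.std()) > std_threshold

# Hypothetical use in the patch loop, skipping instead of crashing:
# for patch in patches:
#     if not has_tissue(patch):
#         continue
#     normalised = normalise_stain(patch)
```

This also suggests a way to confirm the hypothesis: log the patches that fail normalisation and check whether they would all have been rejected by the guard.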
As a pre-processing step in the image pipeline, the stains on WSIs (whole slide images) should be normalised so that staining variation does not affect the training and classification processes.
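The simplest form of this idea is to match each patch's colour statistics to a reference patch. Proper stain normalisation would use a Reinhard-style transfer in Lab colour space or Macenko stain-vector separation; the sketch below is only a minimal channel-wise mean/std match in RGB to illustrate the mechanics, not the project's actual method:

```python
import numpy as np

def match_stain_statistics(patch: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Crude stain normalisation: shift each RGB channel of `patch` so its
    mean and standard deviation match those of a reference `target` patch.
    A real pipeline would work in Lab space (Reinhard) or separate the
    H&E stain vectors (Macenko)."""
    patch_f = patch.astype(np.float64)
    target_f = target.astype(np.float64)
    out = np.empty_like(patch_f)
    for c in range(3):
        p_mean, p_std = patch_f[..., c].mean(), patch_f[..., c].std()
        t_mean, t_std = target_f[..., c].mean(), target_f[..., c].std()
        scale = t_std / p_std if p_std > 0 else 1.0
        out[..., c] = (patch_f[..., c] - p_mean) * scale + t_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```

Note that this interacts with the blank-patch problem: a uniform patch has zero standard deviation, which is exactly the kind of degenerate input a normaliser needs to handle explicitly.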
The training data (H&N5000) has been cropped to include only the important areas; I believe this was done automatically by the microscope when scanning the slides. The evaluation dataset (Cam16) is not cropped at all. This means the model receives far less relevant detail from the evaluation dataset, making it less effective at diagnosis.
This could be resolved by cropping the images based on the available annotation data, or by switching to a patch-based approach, where this would not matter.
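The cropping route could be as simple as taking the bounding box of the annotation mask, which would roughly reproduce the scanner's automatic cropping on the Cam16 slides. A sketch, assuming the annotation is available as a binary mask aligned with the image:

```python
import numpy as np

def crop_to_annotation(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Crop `image` (H, W, C) to the bounding box of the annotated region
    in `mask` (H, W, bool). Assumes the mask contains at least one True."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return image[r0:r1 + 1, c0:c1 + 1]
```

At full WSI resolution this would be done on a downsampled level of the pyramid and the box scaled back up, rather than on a full-resolution array.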
The two modules cannot be imported at the same time. This is a problem now that I am using OpenSlide for the patch-based approach. I will work around it for the moment by fudging the code, but perhaps Pyvips can be replaced entirely with OpenSlide, which would have the benefit of cutting down on dependencies.
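A slightly cleaner version of the fudge, assuming the conflict is between the two libraries' native binaries: defer each import into the function that uses it, so a process that only touches one backend never loads the other's shared libraries. Function names here are illustrative:

```python
def read_region_openslide(slide_path, location, level, size):
    """Read a slide region via OpenSlide. The import is deferred so that
    a process which only uses Pyvips never loads OpenSlide's native
    libraries, and vice versa."""
    import openslide  # deferred: loaded only on first call
    slide = openslide.OpenSlide(slide_path)
    try:
        return slide.read_region(location, level, size)
    finally:
        slide.close()

def load_image_pyvips(image_path):
    """Load an image via Pyvips, with the same deferred-import trick."""
    import pyvips  # deferred
    return pyvips.Image.new_from_file(image_path)
```

This keeps both code paths in the repository while guaranteeing that any single run of the trainer or classifier loads at most one of the two native libraries.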
Currently, all experiments have used a simple model provided in the TensorFlow documentation, referred to for now as the 'first model'. Other models to try:
These should be separated into different files that can be imported by the trainer and classify scripts. This should also reduce code duplication.
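One way to organise this is a small registry module that each model file registers itself with, so the trainer and classifier share a single lookup. This is a hypothetical layout, not the project's code; the builder here returns a placeholder dict where the real version would return the Keras model from the TensorFlow documentation:

```python
# models/registry.py (hypothetical module)
# trainer.py and classify.py would both do:
#     from models.registry import build_model
#     model = build_model("first_model", input_shape=(224, 224, 3))

MODEL_BUILDERS = {}

def register_model(name):
    """Decorator that records a model builder function under `name`."""
    def decorator(builder):
        MODEL_BUILDERS[name] = builder
        return builder
    return decorator

def build_model(name, **kwargs):
    if name not in MODEL_BUILDERS:
        raise ValueError(f"Unknown model {name!r}; known: {sorted(MODEL_BUILDERS)}")
    return MODEL_BUILDERS[name](**kwargs)

@register_model("first_model")
def build_first_model(input_shape=(224, 224, 3)):
    # Placeholder: the real builder would construct and return the simple
    # Keras model used in the experiments so far.
    return {"name": "first_model", "input_shape": input_shape}
```

Each new model then becomes one new file with one decorated builder, and neither the trainer nor the classifier needs to change.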
If the model is trained on the compressed H&N5000 dataset, it can achieve an acceptable level of accuracy. However, this accuracy does not transfer to the Cam16 dataset. Possible causes:
Using Pyvips for the image pipeline requires the Libvips binaries to be present on the machine. These binaries are available for download on Windows, but on Linux they need to be installed with a package manager, e.g.:
sudo apt-get install libvips42
The problem with this when it comes to my project is that I'm executing the code on lab PCs for which I don't have admin rights. I have tried copying across the files from my Linux machine to the lab machine, but there are so many dependencies that doing this manually is impractical and a waste of time (I have already wasted 3 or 4 days).
For now, either I will continue with a parallel workflow, or I can ask IT Support whether the package can be installed. They will probably say no.
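A third option worth checking, assuming a user-level conda (e.g., Miniconda installed into the home directory) is allowed on the lab machines: the conda-forge channel packages libvips and pyvips, so they can be installed without admin rights. Versions and names below are illustrative:

```shell
# After installing Miniconda into $HOME (no root needed):
# pyvips from conda-forge pulls in the libvips binaries as a dependency.
conda create -n histology -c conda-forge python=3.10 pyvips
conda activate histology
python -c "import pyvips"   # sanity check that the binding loads
```

Whether this is permitted depends on the lab machines' policy, so it is worth confirming with IT Support before relying on it.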
Using the annotations on the data, it should be possible to step through each slide one small region at a time. If a region lies within an annotated area, that annotation's label is applied to the patch. If it is outside, or on the border of, an annotated area, it could be discarded or put into a neutral class.
If these patches are then exported and used to train a neural network, a new slide could be analysed by cutting it into patches and running each one through the neural network. This approach has been shown to be effective by previous research.
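The labelling rule above can be sketched as follows. For simplicity the annotations are rectangles here; real Cam16 annotations are polygons, so a full version would use polygon containment, and the OpenSlide tiling at the end is shown only as commented-out illustration:

```python
def label_patch(patch_box, annotation_boxes):
    """Label one patch by the rule above.

    `patch_box` is an (x0, y0, x1, y1) rectangle; `annotation_boxes` is a
    list of ((x0, y0, x1, y1), label) pairs. Returns the label if the
    patch lies fully inside an annotated region, and None (discard or
    neutral class) if it is outside or straddles a border.
    """
    px0, py0, px1, py1 = patch_box
    for (ax0, ay0, ax1, ay1), label in annotation_boxes:
        if ax0 <= px0 and ay0 <= py0 and px1 <= ax1 and py1 <= ay1:
            return label  # fully inside this annotation
    for (ax0, ay0, ax1, ay1), _ in annotation_boxes:
        if px0 < ax1 and ax0 < px1 and py0 < ay1 and ay0 < py1:
            return None  # straddles a border: discard / neutral class
    return None  # entirely outside every annotation

# Tiling a slide with OpenSlide might look like this (untested sketch):
# import openslide
# slide = openslide.OpenSlide("slide.tif")
# for y in range(0, slide.dimensions[1], 256):
#     for x in range(0, slide.dimensions[0], 256):
#         patch = slide.read_region((x, y), 0, (256, 256))
#         label = label_patch((x, y, x + 256, y + 256), annotation_boxes)
```

Exporting only the labelled patches then gives exactly the training set the patch-based approach needs, and the same tiling loop can be reused at classification time.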