ALCF Concept to Clinic Challenge
Home Page: https://concepttoclinic.drivendata.org/
License: MIT License
When trying to see the Travis logs of a build, you end up on a Travis page stating "We couldn't find the repository concept-to-clinic/concept-to-clinic".
When trying to access the Travis logs of a build, you should be able to see the logs of the respective build.
You see the described error page.
Are the builds only visible to admins? If so, it might be helpful to change some access rights.
Check a build like this one
I want to see any problems that occurred when running a Travis build.
Contributors should be able to access the Travis logs so they know what went wrong.
I have checked my `local.py` configuration and ensured that I built the containers again so that they reflect the most recent version of the project at the `HEAD` commit on the `master` branch.
A "breadcrumb trail" is a type of secondary navigation scheme that reveals the user's location in a website or web application. For example, "Open image -> Detect and select -> ...". We should add this navigation, based on the current stage in the analysis, to aid the end-user.
The current page should be highlighted:
The links do not need to be actual hyperlinks, merely text.
We should avoid DRY (Don't Repeat Yourself) violations, i.e. not duplicate the HTML on every possible page with slightly different markup. The links do not need to be clickable.
Issues: #9
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | Pierre Fillard (Therapixel) |
rank | 5 |
repo | https://github.com/pfillard/tpx-kaggle-dsb2017 |
trained models | https://github.com/pfillard/tpx-kaggle-dsb2017/tree/master/models |
converted branch | |
ML engine | |
engine-version | |
ML backend | Tensorflow |
backend-version | 1.1 |
training method | |
architecture | |
algorithm | |
OS | Ubuntu |
OS version | 16.04 |
Python version | 3.5 |
CUDA version | 8 |
cuDNN version | 5.1 |
notes | https://github.com/pfillard/tensorflow/tree/r1.0_relu1 |
docs folder
NOTE: All PRs must follow the standard PR checklist.
Git hooks are useful tools that run commands before you commit
We should have a default githook script and instructions on how to install the git hook.
Add a `.githook-pre-commit` file with the script to run the tests (`tests/test-docker.sh`) and flake8 on both of the codebases.
NOTE: All PRs must follow the standard PR checklist.
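A minimal sketch of what such a hook script could look like, assuming it is written in Python. The flake8 targets and the `tests/test-docker.sh` path come from this issue; the overall shape is illustrative, not the project's actual hook.

```python
#!/usr/bin/env python
"""Hypothetical pre-commit hook sketch: run flake8 and the test suite
before allowing a commit to proceed."""
import subprocess

# Commands assumed from the issue text: flake8 on both codebases plus
# the docker test script.
CHECKS = [
    ["flake8", "prediction", "interface"],
    ["tests/test-docker.sh"],
]


def run_checks(commands):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in commands:
        returncode = subprocess.call(cmd)
        if returncode != 0:
            print("pre-commit check failed: %s" % " ".join(cmd))
            return returncode
    return 0

# A real hook script would end with: sys.exit(run_checks(CHECKS))
```

To install, the instructions would tell contributors to copy (or symlink) this file to `.git/hooks/pre-commit` and make it executable.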
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | Julian de Wit |
rank | 2 |
repo | https://github.com/juliandewit/kaggle_ndsb2017 |
trained models | https://retinopaty.blob.core.windows.net/ndsb3/trained_models.rar |
converted branch | |
ML engine | Keras |
engine-version | |
ML backend | Tensorflow |
backend-version | |
training method | |
architecture | |
algorithm | |
OS | Windows |
OS version | |
Python version | 3.5 |
CUDA version | |
cuDNN version | |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | DL Munich |
rank | 7 |
repo | https://github.com/NDKoehler/DataScienceBowl2017_7th_place |
trained models | requested |
converted branch | |
ML engine | |
engine-version | |
ML backend | Tensorflow |
backend-version | 1.0.1 |
training method | |
architecture | |
algorithm | |
OS | Ubuntu |
OS version | 14.04 |
Python version | |
CUDA version | |
cuDNN version | |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
When viewing the documentation through https://concepttoclinic.drivendata.org/documentation, external links should take the browser to a new webpage.
Clicking external links (at least in Firefox and Chrome) leads to a page with the "Concept to Clinic" heading that is otherwise blank.
It looks like the documentation page uses an iframe to display a readthedocs page. I think this iframe might need to be reconfigured to allow redirecting the target of its parent window. Since concepttoclinic.drivendata.org isn't controlled by this public github repo, I think an admin will have to fix this.
The project structure documentation is a useful first step, but could get much better. The first way to do that is to turn the references to the technologies into useful links to those technologies.
Current page: https://concept-to-clinic.readthedocs.io/en/latest/project-structure.html
The updated Project-structure page should link out from at least the following references to the project pages or documentation that is relevant for that reference:
Update the `docs/project-structure.md` file.
NOTE: All PRs must follow the standard PR checklist.
When browsing for a test case to analyse, it would be very helpful for the end-user to be able to preview a particular case before proceeding to the next step. This would also provide a means to ensure and double-check that they are selecting the correct patient.
We should therefore display metadata about an image prior to selection.
The image metadata should be displayed in a panel on the right-hand side of the interface. For example:
Upon clicking on an image file, the data will have to be loaded. A preview of the image will need to be displayed too, likely in a separate HTTP call, but it may be possible and cleaner to provide a preview within the same payload by base64-encoding the image data.
A pane on the right-hand side of the open image will reveal all the relevant metadata for an image.
Issues: #9
Currently, there is just a placeholder in the algorithm that classifies nodules in scans. Nodules are areas of interest that might be cancerous. We need to adapt the Data Science Bowl algorithms to predict P(cancer)
for a given set of centroids for nodules.
Given a model trained to perform this task, a DICOM image, and the nodule centroids, return the P(cancer)
for each nodule.
Design doc reference:
Jobs to be done > Annotate > Prediction service
`prediction/src/algorithms/classify/trained_model/predict` method
`prediction/src/algorithms/classify/src/` folder
`prediction/src/algorithms/classify/assets/` folder using `git-lfs`
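The described interface can be sketched as follows. The module path and the `predict` name come from this issue; the parameter names, the centroid dict shape, and the placeholder model are assumptions for illustration.

```python
"""Sketch of the classify service's expected interface. The real model
would be deserialized from prediction/src/algorithms/classify/assets/;
here a trivial callable stands in for it."""


def predict(dicom_path, centroids, model=None):
    """Return P(cancer) for each nodule centroid in a DICOM image.

    dicom_path: path to a directory of DICOM files.
    centroids: iterable of dicts like {'x': 10, 'y': 20, 'z': 3}.
    model: a callable returning a probability; placeholder by default.
    """
    if model is None:
        # Placeholder: always return 0.5 until a trained model is wired in.
        model = lambda image_path, centroid: 0.5
    return [
        {'x': c['x'], 'y': c['y'], 'z': c['z'],
         'p_concerning': float(model(dicom_path, c))}
        for c in centroids
    ]
```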
This feature is a first-pass at getting a model that completes the task with the defined input and output. We are not yet judging the model based on its accuracy or computational performance.
NOTE: All PRs must follow the standard PR checklist.
The "developer documentation" under Getting Started doesn't link to the "Local development with Docker" section.
The link should point to https://concept-to-clinic.readthedocs.io/en/latest/developing-locally-docker.html
The link points to the main documentation page https://concepttoclinic.drivendata.org/documentation
Change the link or reword the text so that it is clear where the current link leads to.
visit https://github.com/concept-to-clinic/concept-to-clinic#getting-started
click the "developer documentation" link
In order to help the end-user get a quick overview of the scale of the case, we should show a "X candidates found" message on the "Detect and select" step of the identification/analysis.
For example, "7 candidates found" as displayed here:
If the backend does not provide summary `num_candidates` metadata, the value can be calculated by counting the number of candidates returned by the API. The text displayed should be pluralised in a reasonably clean manner, so "1 candidate found" vs "2 candidates found", etc.
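The frontend is JavaScript/Vue, so this would really live there, but the fallback-count and pluralisation logic can be illustrated in Python. The payload keys `num_candidates` and `candidates` are taken from this issue and the project's API discussion; the function name is hypothetical.

```python
def candidates_message(payload):
    """Build the "X candidates found" summary text, falling back to
    counting the candidate list when num_candidates is absent."""
    count = payload.get('num_candidates', len(payload.get('candidates', [])))
    noun = 'candidate' if count == 1 else 'candidates'
    return '%d %s found' % (count, noun)
```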
The "detect and select" page shows the number of candidates found via a "X candidates found" message.
Issues: #9
The interface should include the current Git version in the top-right corner. This is to ease reporting of issues; any screenshots will implicitly include the version, potentially saving wasted effort when debugging possibly-fixed bugs, etc.
In the interface, the truncated SHA of the Git version used to build the site should be displayed, eg. b1f2ad46
.
It should also work in production, where the `.git/`
metadata directory will not exist, so some change to the deployment scripts may be required to capture this data. Care should be taken re. caching so as not to inflict a performance penalty on every page load.
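A sketch of the development-time path, with caching so the subprocess cost is paid at most once per process. In production (no `.git/` directory) this returns `None`, and a deploy script would instead bake the SHA into an environment variable or file; the function name is hypothetical.

```python
import subprocess
from functools import lru_cache


@lru_cache(maxsize=1)
def get_git_sha():
    """Return the truncated commit SHA (e.g. 'b1f2ad46'), or None when
    not running inside a git checkout. lru_cache avoids re-running the
    subprocess on every page load."""
    try:
        sha = subprocess.check_output(
            ['git', 'rev-parse', '--short=8', 'HEAD'],
            stderr=subprocess.DEVNULL)
        return sha.decode('ascii').strip()
    except (OSError, subprocess.CalledProcessError):
        return None
```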
The top-right corner will display the SHA-1.
Issues: #9
Currently test images are stored in the git repository. These files won't change much so we should try to save bandwidth and separate concerns by tracking them with git-lfs
instead.
DICOM images in `tests/assets` are managed with `git-lfs`.
NOTE: All PRs must follow the standard PR checklist.
Currently, there is just a placeholder in the algorithm that segments nodules in scans. Nodules are areas of interest that might be cancerous. We need to adapt the Data Science Bowl algorithms to predict nodule boundaries and descriptive statistics from an iterator of nodule centroids for an image.
Given a model trained to perform this task, a DICOM image, and an iterator of nodule centroids, save a file with boundaries (3D boolean mask with true values for voxels associated with that nodule), widest width, and volume to disk. Yield paths to the saved file for each nodule.
Design doc reference:
Jobs to be done > Segment > Prediction service
`prediction/src/algorithms/segment/trained_model/predict` method
`prediction/src/algorithms/segment/src/` folder
`prediction/src/algorithms/segment/assets/` folder using `git-lfs`
This feature is a first-pass at getting a model that completes the task with the defined input and output. We are not yet judging the model based on its accuracy or computational performance.
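The save-to-disk-and-yield-paths contract described above can be sketched as below. The output record fields (mask, widest width, volume) come from this issue; the file format, field names, and tiny placeholder mask are illustrative assumptions.

```python
import os
import pickle
import tempfile


def predict(dicom_path, centroids, output_dir=None):
    """Sketch of the segment service: for each nodule centroid, save a
    file holding the 3D boolean mask plus widest-width and volume
    statistics, and yield the path to that file."""
    output_dir = output_dir or tempfile.mkdtemp()
    for i, centroid in enumerate(centroids):
        # Placeholder 2x2x2 mask; a real model would produce a mask
        # voxel-aligned with the DICOM image dimensions.
        mask = [[[True, False], [False, False]],
                [[False, False], [False, False]]]
        record = {
            'centroid': centroid,
            'binary_mask': mask,      # True for voxels in this nodule
            'widest_width_mm': 1.0,   # placeholder statistic
            'volume_mm3': 1.0,        # placeholder statistic
        }
        path = os.path.join(output_dir, 'nodule_%d.pkl' % i)
        with open(path, 'wb') as f:
            pickle.dump(record, f)
        yield path
```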
NOTE: All PRs must follow the standard PR checklist.
When docs are built (e.g. with make html
run from docs
directory), the prediction and interface projects should have their docstrings converted to entries in the documentation.
Add a `code.rst` document that has its contents derived automatically from docstrings; don't worry about breaking this up into smaller files yet. Add it to the `toctree` (table of contents) in `index.rst`.
You don't need to add to or fix existing docstrings as long as there is at least one that shows the autodoc is working.
When `make html` is run from the `docs` directory, the `prediction` and `interface` pieces of the project can get built using autodoc when the appropriate entry is put into `code.rst`. At least one docstring rendered via `code.rst` demonstrates this is working.
NOTE: All PRs must follow the standard PR checklist.
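The `docs/conf.py` additions this requires might look like the following. The relative paths are assumptions about where `conf.py` sits relative to the two sub-projects; `code.rst` would then use `.. automodule::` directives pointing at the packages made importable here.

```python
# Sketch of docs/conf.py additions for Sphinx autodoc.
import os
import sys

# Make the prediction and interface packages importable by autodoc
# (paths are assumed relative to the docs/ directory).
sys.path.insert(0, os.path.abspath('../prediction'))
sys.path.insert(0, os.path.abspath('../interface'))

extensions = [
    'sphinx.ext.autodoc',  # pull docstrings into the built docs
]
```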
Future documentation of detection algorithms (Issues #18, #19, #20, #21, #22, #23, #24, #25, #26, #27 and #28) should have a consistent structure and make it easy to compare the different algorithms.
The issues mentioned above ask for documentation of algorithms from the Data Science Bowl. If addressed by several people, the documentation of each algorithm may end up inconsistent and messy, making it unnecessarily hard to read and compare algorithms. Thus the advantage over just using the original documentation would be minimal.
Create a template file specifying sections and content to be filled in as much as possible. This issue thread is also thought of as a place to discuss which information about the algorithms should be included in the documentation.
Add a `template_algorithms.md` file containing the template to the `docs/template` folder.
Whilst initially any summary will be fairly bare, by adding a summary of all of the data from a case into a single JSON report early on in the project, we can be sure that any data will/can be exported for completeness reasons, etc.
It will also likely help in debugging and development of backend components, as it will avoid making manual queries to the database via SQL commands or via a Python shell such as IPython.
We should be able to include the general notes, the details for each nodule, and all data about a case in a single JSON report.
For now, this can be a simple view that takes a `Case`'s data, generates the corresponding `dict` structure, and pretty-prints it using the `pprint` module within an HTML page.
The generation of the `dict` should be separate from the display so it can be reused and tested.
The summary can be downloaded, and the code used to generate it has at least one simple test case to ensure a lack of obvious regressions.
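The separation of dict generation from display might look like the sketch below. The field names (`notes`, `nodules`, `centroid`, `concerning`) are assumptions about the eventual `Case` model, not its actual schema.

```python
from pprint import pformat


def case_to_dict(case):
    """Build the JSON-ready summary structure for a case. Kept separate
    from any view so it can be reused and unit-tested."""
    return {
        'case_id': case.get('id'),
        'notes': case.get('notes', ''),
        'nodules': [
            {'centroid': n.get('centroid'), 'concerning': n.get('concerning')}
            for n in case.get('nodules', [])
        ],
    }


def render_summary(case):
    """Pretty-print the summary dict inside a minimal HTML page."""
    return '<html><body><pre>%s</pre></body></html>' % pformat(case_to_dict(case))
```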
Before more data is added to this view, we will start by simply showing the list of candidate nodules that are returned by the backend. Therefore, "Detect and select" views should show list of candidate nodules.
No accept/reject support is required here; that will be covered elsewhere.
The list of nodule candidates should be displayed on the "Detect and select" stage of case investigation:
An "accordion" interface is suggested. The predicted centroid metadata should be displayed as well, with any floating point numbers suitably truncated for human consumption.
The view called "Detect and select" will show the list of candidates corresponding to the current case, with any relevant metadata (e.g. centroids) displayed.
Issues: #9
We want to be able to use .vue
files, ES6, and so forth.
As a first pass, NdagiStanley/vue-django has a lot of what we want. At a minimum, we want to be able to build `.vue` files so that we can use ✨ ES6 ✨ and have our project JS compiled, including `.vue` files.
Design doc reference:
A downside of the vue-django example is that we have two devservers (I think?) -- one from Django's `manage.py` and one from the JS setup.
There should be a build process for `.vue` (and other JS) assets, the output of which can be served by the Django process.
NOTE: All PRs must follow the standard PR checklist.
Currently, there is just a placeholder in the algorithm that classifies nodules in scans. Nodules are areas of interest that might be cancerous. We need to adapt the Data Science Bowl (DSB) algorithms to predict P(cancer)
given an iterator of nodule centroids for an image.
The top DSB algorithm (grt123) was written to run on a GPU for Python2. It would be nice to integrate this algorithm into the current structure and update it to run on Python3 (potentially on a CPU as well).
Given the grt123 model trained to perform this task, a DICOM image, and an iterator of nodule centroids, yield the P(cancer)
for each nodule.
Design doc reference: Detect and select
The majority of the Python3 and CPU conversion has been completed and is available in the conversion branch.
The forked model is available here (reads source DICOM images from S3).
One area that definitely needs review is the Py2/Py3 floor/true division conversions. Some calculations explicitly converted numbers to floats, and in those cases it was apparent that true division was desired. However, the remaining floor division calculations should be checked to ensure that true division is not the appropriate operation.
If you get a `UnicodeDecodeError` while trying to load the serialized Torch model, use the `torch_loader` function in the `utils` module instead.
When running on the CPU, it isn't necessary to perform that much work. Just enough to obtain a plausible result in a reasonable amount of time.
This feature should be implemented in the `prediction/classify/trained_model/predict` method.
NOTE: All PRs must follow the standard PR checklist.
We need to adapt the Data Science Bowl algorithms to produce possible centroid locations for nodules within an image rather than just P(cancer)
for the whole image.
Currently, there is just a placeholder in the algorithm that identifies nodules in scans. Nodules are areas of interest that might be cancerous (or might not be; the goal here is just the potentially concerning areas). This must actually yield centroid locations of potential nodules (X voxels from left, Y voxels from top, Z slice number).
First we need to train a model to perform this task. Then, we need to serialize the model so that it can be loaded from disk and used to make predictions. This trained model should be added to the prediction/src/algorithms/identify/assets/
folder using git-lfs
. Finally, we need to write the code in the predict
method that will load the model from assets, take in a DICOM image, and yield nodule locations in the specified format.
Design doc reference:
Jobs to be done > Detect and select > Prediction service
`prediction/src/algorithms/identify/trained_model/predict` method
`prediction/src/algorithms/identify/src/` folder
`prediction/src/algorithms/identify/assets/` folder
This feature is a first-pass at getting a model that completes the task with the defined input and output. We are not yet judging the model based on its accuracy or computational performance.
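The expected generator shape can be sketched as below. The `predict` name and centroid format (X voxels from left, Y from top, Z slice number) come from this issue; the placeholder detector and dict keys are illustrative assumptions.

```python
def predict(dicom_path, model=None):
    """Sketch of the identify service: yield candidate nodule centroid
    locations for a DICOM image. A real implementation would load the
    serialized model from prediction/src/algorithms/identify/assets/."""
    if model is None:
        # Placeholder detector returning one fixed candidate location.
        model = lambda path: [(10, 20, 3)]
    for x, y, z in model(dicom_path):
        yield {'x': int(x), 'y': int(y), 'z': int(z)}
```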
NOTE: All PRs must follow the standard PR checklist.
Even though the backend pieces have yet to be written, it will be helpful to get a mocked out frontend version quickly. Small issues exist for many of the individual UI pieces, but first the overall layout has to be created.
We want DRY templates, so we need a base template that gives the rough Bootstrap layout (e.g. navbar, breadcrumbs, main div, left pane, main pane).
Closing this issue shouldn't implement the individual UI pieces.
In fact, many pieces of these pages will end up being refactored into individual Vue components in order to do dynamic front-endy things, so it's most important at this point to lay out some semantically correct HTML5 that others can build upon.
Design doc reference:
[1] Open imagery
[2] Detect and select
[3],[4] Annotate and segment
[5] Report and export
Use `interface/frontend/index.html` as a starting point. Links can be `href="#"` for now.
NOTE: All PRs must follow the standard PR checklist.
We need documentation on how to update the documentation! Bootstrap this for us!
Current page: https://concept-to-clinic.readthedocs.io/en/latest/project-structure.html
The documentation is a sphinx project. We need documentation on how to edit, structure, build locally, and test the documentation for participants who aren't familiar with sphinx.
NOTE: All PRs must follow the standard PR checklist.
Currently test images are stored in the git repository. These images are formatted as directories of DICOM files. Each directory contains hundreds of files, and consumes ~100MB of space. We would like to reduce the number of files to the bare minimum needed to detect nodules and pass the tests.
The number of DICOM directories and images is reduced to the minimum amount necessary (1 - 3 directories, each with 5 - 10 DICOM files).
NOTE: All PRs must follow the standard PR checklist.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | Deep Breath |
rank | 9 |
repo | https://github.com/EliasVansteenkiste/dsb3 |
trained models | not available |
converted branch | |
ML engine | Lasagne |
engine-version | 0.2.dev1 |
ML backend | Theano |
backend-version | 0.9.0b1 |
training method | CNN |
architecture | inception |
algorithm | Resnet |
OS | Ubuntu |
OS version | 16.04 |
Python version | 2.7 |
CUDA version | 8 |
cuDNN version | 5.1 |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
We need to adapt the Data Science Bowl algorithms to produce possible centroid locations for nodules within an image rather than just P(cancer)
for the whole image.
Currently, we just catch all exceptions and ignore them. We want to return as part of the json payload some useful information about what went wrong. For example, if the request didn't have the right parameters, we say what was missing. If the DICOM image could not be found, we say the file was not where we expected it.
In the `prediction/src/views.py` file, add `except ExceptionType` blocks for different kinds of expected errors. For example:

```python
try:
    ...
except ImportError as ie:
    pass  # one error message
except ValueError as e:
    pass  # a different error message
```

We don't need to catch every possible exception. This issue is to identify how some common errors might occur and then return useful error messages.
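One way to turn caught exceptions into a useful JSON payload is sketched below. The exception-to-message mapping and the `error`/`message` key names are illustrative assumptions, not a fixed API.

```python
def error_payload(exc):
    """Map an expected exception to a JSON-ready error body that says
    what went wrong, instead of silently swallowing the exception."""
    if isinstance(exc, FileNotFoundError):
        message = 'DICOM image not found: %s' % exc
    elif isinstance(exc, KeyError):
        message = 'Missing required parameter: %s' % exc
    else:
        message = str(exc)
    return {'error': type(exc).__name__, 'message': message}
```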
NOTE: All PRs must follow the standard PR checklist.
After cloning using Git (without LFS), a repository with around 30 commits and no test data should be at most a few MB in size.
It's over 140 MB when cloning the master.
I guess that the reason is that in the beginning even big files were pushed using Git. When they were then transferred to Git LFS, they weren't removed from the "normal" Git history.
Would it be possible to remove those big files from the history using references like Removing sensitive data from a repository or How to remove/delete a large file from commit history in Git repository? Unfortunately, I'm not that familiar with LFS or history rewriting...
I have checked my `local.py` configuration and ensured that I built the containers again so that they reflect the most recent version of the project at the `HEAD` commit on the `master` branch.
Some nodules will be identified as non-candidates by the end user. Therefore, we should provide the ability to accept/reject nodule candidates on the "Detect and select" view.
The user's "Dismiss" and "Mark concerning" choices should be persisted:
The next nodule candidate is then displayed, until there are none left.
The user's decision should be sent back to the backend immediately upon selection for persistence, rather than aggregating the entire page's results first.
Issues: #9
A useful API will tell you how to use it. Update the API endpoints to be more helpful!
Currently, there is just a placeholder if you GET
our prediction API endpoints. We expect this call to tell us what the endpoint does, what the required parameters are, and what it returns.
To do this, we'll need to update our response payload to have that information.
In `prediction/src/views.py`, include `description`, `expected_parameters`, and `return_values` in the payload for each endpoint, and update the tests in `prediction/src/tests/test_endpoints.py`.
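The self-describing payload the issue asks for might look like the sketch below. The three top-level keys come from this issue; the field contents describing a classify endpoint are illustrative assumptions.

```python
def describe_endpoint():
    """Body for a GET on a prediction endpoint: instead of a bare
    placeholder, say what the endpoint does, what parameters it
    expects, and what it returns."""
    return {
        'description': 'Classify nodules: returns P(cancer) per centroid.',
        'expected_parameters': {
            'dicom_path': 'path to a DICOM image directory',
            'centroids': 'list of {x, y, z} voxel coordinates',
        },
        'return_values': {
            'predictions': 'list of {x, y, z, p_concerning} objects',
        },
    }
```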
NOTE: All PRs must follow the standard PR checklist.
The first step of the analysis and identification process is to select an image file from the local disk. We should therefore show the user all the possible files available.
Upon loading the application, we should show a directory tree of the potential files:
The files should be displayed in a tree, ideally using the Django `storage` framework. In the development environment this should be a directory within our project, but ignored via a suitable `.gitignore` file.
No file format or name filtering is required; we can assume all files in the specified directory are valid. Sorting may need to be applied to the result of Django's listdir
for deterministic ordering.
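The recursive, sorted listing can be sketched in plain Python. Django's `Storage.listdir` returns a `(directories, files)` pair, so the same structure applies there; this stand-in uses `os` directly and the nested-dict shape is an illustrative assumption.

```python
import os


def build_tree(root):
    """Return a sorted nested structure describing the files under root,
    mirroring the (directories, files) split of Django's Storage.listdir.
    Explicit sorting gives deterministic ordering."""
    entries = sorted(os.listdir(root))
    dirs = [e for e in entries if os.path.isdir(os.path.join(root, e))]
    files = [e for e in entries if not os.path.isdir(os.path.join(root, e))]
    return {
        'directories': {d: build_tree(os.path.join(root, d)) for d in dirs},
        'files': files,
    }
```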
All files are listed in their correct hierarchy, sorted alphabetically.
Issues: #9
There are a handful of large data sources for this project. We want to use git-lfs
to manage those large files so they are not all in the repository. We need the documentation so that all the contributors understand how to use git-lfs
.
At least the model asset folders (e.g., prediction/src/algorithms/classify/assets
) and the test assets folder (tests/assets) have large files in them. Model assets are already tracked by git-lfs
for people who have it installed (see .gitattributes).
Our documentation should enable people to:
- install `git-lfs`
- use `git-lfs` with this project
- track new large files with `git-lfs` and remove those files from the repo
- opt out of `git-lfs` (for example, to save on bandwidth) and pull the repo without the large files
Add this documentation to the `docs` folder.
Helpful links:
https://git-lfs.github.com/
https://help.github.com/articles/about-git-large-file-storage/
NOTE: All PRs must follow the standard PR checklist.
Add ability to set right/left lung for candidate nodule
The annotation and segmentation view should support the ability to save which lung (right or left) the nodule corresponds to; before we pass back any more detailed or complex information, we should simply provide a way to return this information so that adding further data in future is easier.
It should be possible to select and subsequently save the per-nodule lung orientation using the dropdown:
Note that only the left/right lung selection is part of this issue.
The dropdown should not save until an "Accept" button has been pressed; see the design document for an example wireframe.
Per-nodule lung orientation should be returned and persisted to the backend service, once "Accept" is pressed.
Issues: #9
The tests are currently failing on the master
branch.
We should endeavour to keep the testsuite passing at all times.
$ ./manage.py test
Creating test database for alias 'default'...
System check identified no issues (0 silenced).
..F...
======================================================================
FAIL: test_landing (backend.static.tests.SmokeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/lamby/git/work/drivendata/challenge-application-test/interface/backend/static/tests.py", line 9, in test_landing
self.assertContains(resp, 'Hello')
File "/home/lamby/git/work/drivendata/challenge-application-test/interface/.venv/lib/python2.7/site-packages/django/test/testcases.py", line 393, in assertContains
self.assertTrue(real_count != 0, msg_prefix + "Couldn't find %s in response" % text_repr)
AssertionError: Couldn't find 'Hello' in response
----------------------------------------------------------------------
Ran 6 tests in 0.049s
FAILED (failures=1)
Destroying test database for alias 'default'...
The testsuite passes.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | Daniel Hammack |
rank | 1 |
repo | https://github.com/dhammack/DSB2017 & https://github.com/juliandewit/kaggle_ndsb2017 |
trained models | https://retinopaty.blob.core.windows.net/ndsb3/trained_models.rar |
converted branch | |
ML engine | Keras |
engine-version | |
ML backend | Theano |
backend-version | |
training method | |
architecture | |
algorithm | |
OS | Windows |
OS version | 64bit |
Python version | 2.7 & 3.5 |
CUDA version | |
cuDNN version | |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
The interface backend has some model stubs for tracking metadata about DICOM imagery. The Case
model tracks a radiologist's workflow examining an image series, so it's the heart of this application. But the Case
has foreign keys to the ImageSeries
model, which needs to know about the data.
We should be able to pass a directory URI to this method (helpful ref here) and expect that an ImageSeries
object should be created if necessary, with metadata fields filled in.
Add this to the `ImageSeries` class, possibly as a `classmethod`; this could be like:

```python
uri = 'file:///path/to/project/tests/assets/LIDC-IDRI-0001/1.3.6.1.4.1.14519.5.2.1.6279.6001.298806137288633453246975630178/1.3.6.1.4.1.14519.5.2.1.6279.6001.179049373636438705059720603192'
ImageSeries.get_or_create(uri)
```
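A framework-free sketch of the URI parsing such a classmethod would need. The `<patient_id>/<study_uid>/<series_uid>` layout is inferred from the example URI in this issue; in the real Django model these fields would feed `ImageSeries.objects.get_or_create`.

```python
from urllib.parse import urlparse


def parse_image_series_uri(uri):
    """Extract the metadata fields an ImageSeries row would need from a
    file:// URI laid out as .../<patient_id>/<study_uid>/<series_uid>."""
    path = urlparse(uri).path.rstrip('/')
    parts = path.split('/')
    return {
        'uri': uri,
        'patient_id': parts[-3],
        'series_instance_uid': parts[-1],
    }
```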
NOTE: All PRs must follow the standard PR checklist.
A few links in the Getting Started section of the documentation are broken (the 2nd and 3rd bullets).
The links found at
should point to valid urls.
NOTE: All PRs must follow the standard PR checklist.
All of our models need to take a path to a DICOM image (which is actually a directory of images and XML files) and then load that image into memory.
The function should take a path to a DICOM directory, load the data from that directory into a format that will be useful to the models, and then provide that data to its callers. For example, DICOM-numpy may be useful here.
This issue is for a first pass implementation. As the models evolve, we may need to update and change the format that this method provides to its callers.
Implement this in the `prediction/src/preprocess` folder.
NOTE: All PRs must follow the standard PR checklist.
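The core of the loading step is ordering per-slice pixel data into a 3D volume, sketched here without external dependencies. Each `(instance_number, pixel_rows)` pair stands in for what pydicom would read from one file; on real data, a library such as dicom-numpy also validates spacing and orientation.

```python
def stack_slices(slices):
    """Order per-slice pixel data into a 3D volume (list of 2D slices).
    slices: iterable of (instance_number, pixel_rows) pairs, standing in
    for the per-file data a DICOM reader would produce."""
    ordered = sorted(slices, key=lambda item: item[0])
    return [pixels for _, pixels in ordered]
```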
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | Alex, Andre, Gilberto, Shize |
rank | 8 |
repo | https://github.com/astoc/kaggle_dsb2017 |
trained models | https://github.com/astoc/kaggle_dsb2017/tree/master/code/Andre/nodule_identifiers |
converted branch | |
ML engine | Keras |
engine-version | 1.2.2 |
ML backend | Theano |
backend-version | |
training method | |
architecture | |
algorithm | |
OS | |
OS version | |
Python version | |
CUDA version | |
cuDNN version | |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | qfpxfd |
rank | 4 |
repo | http://www.cis.pku.edu.cn/faculty/vision/wangliwei/software.html |
trained models | not available |
converted branch | |
ML engine | Keras |
engine-version | |
ML backend | Tensorflow |
backend-version | |
training method | CNN |
architecture | 3D VGG |
algorithm | |
OS | |
OS version | |
Python version | |
CUDA version | |
cuDNN version | |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | Owkin Team |
rank | 10 |
repo | https://github.com/owkin/DSB2017 |
trained models | https://github.com/owkin/DSB2017/blob/master/sje_scripts/Unet_X.hdf5 |
trained models | https://github.com/owkin/DSB2017/blob/master/sje_scripts/Unet_Y.hdf5 |
trained models | https://github.com/owkin/DSB2017/blob/master/sje_scripts/Unet_Z.hdf5 |
trained models | https://github.com/owkin/DSB2017/blob/master/pic_scripts/model64x64x64_v5_rotate_v2.h5 |
trained models | xgboost trees/features not provided |
converted branch | |
ML engine | Keras |
engine-version | 2 |
ML backend | Tensorflow |
backend-version | 1 |
training method | |
architecture | |
algorithm | |
OS | |
OS version | |
Python version | |
CUDA version | |
cuDNN version | |
notes | http://pyradiomics.readthedocs.io/en/latest/installation.html |
docs folder
NOTE: All PRs must follow the standard PR checklist.
Via #65 (comment)
@reiinakano wrote:
Also, I can't seem to see the Travis CI log. This would be helpful so I can see what tests failed (if there are any). Since this is a public repo anyway, why not make the CI logs public as well?
Visit https://concept-to-clinic.readthedocs.io/en/latest/design-doc.html#prediction-service and you will notice that the 3rd-level headings are the same in each section; this prevents you from navigating to any section other than the 1st.
Clicking on a heading (e.g., Interface API) should go to the appropriate section
You can only navigate within the first section
Use different heading names, or the `:ref:` directive.
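One possible fix, sketched below: Sphinx's `:ref:` role targets explicit labels rather than auto-generated heading anchors, so identically named headings can be disambiguated by placing a unique label above each one. The label names here (`segment-interface`, `identify-interface`) are illustrative, not taken from the actual design doc.

```rst
.. _identify-interface:

Interface
~~~~~~~~~

(identify section content...)

.. _segment-interface:

Interface
~~~~~~~~~

(segment section content...)

Elsewhere in the docs, link to a specific one with
:ref:`segment-interface` instead of relying on the duplicated
heading anchor.
```

This keeps the visible heading text identical across sections while giving each a unique, linkable anchor.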
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
rank | 3 |
repo | https://bitbucket.org/aidence/kaggle-data-science-bowl-2017 |
trained models | requested |
converted branch | |
ML engine | |
engine-version | |
ML backend | Tensorflow |
backend-version | 1.1 |
training method | |
architecture | |
algorithm | Resnet |
OS | |
OS version | |
Python version | 3.4 |
CUDA version | |
cuDNN version | |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | grt123 |
rank | 1 |
repo | https://github.com/lfz/DSB2017 |
trained models | https://github.com/lfz/DSB2017/tree/master/model |
converted branch | https://github.com/concept-to-clinic/DSB2017 |
ML engine | pytorch |
engine-version | 0.1.10+ac9245a |
ML backend | |
backend-version | |
training method | |
architecture | |
algorithm | |
OS | Ubuntu |
OS version | 14.04 |
Python version | 2.7 |
CUDA version | 8 |
cuDNN version | 5.1 |
notes |
docs folder
NOTE: All PRs must follow the standard PR checklist.
We want to provide clinicians with summary statistics about identified tumors. In the segment
model, we output per-pixel binary masks that identify which pixels are likely to be cancer. From these masks, we want to calculate the volume of the tumor.
Take as input the path to the serialized masks (the output of the segment `predict` method). We also have the centroids that are passed into the `predict` method. For each centroid, calculate the volume of the tumor. DICOM provides slice dimensions, so the units should be real measurements (e.g., cubic millimeters), not pixels.
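A minimal sketch of the volume calculation, assuming the serialized mask deserializes to a 3D binary NumPy array and that voxel spacing has already been read from the DICOM metadata (`SliceThickness` and `PixelSpacing`); the function name and argument layout are illustrative, not from the codebase:

```python
import numpy as np


def tumor_volume_mm3(mask, spacing):
    """Volume of a binary tumor mask in cubic millimeters.

    mask: 3D array of 0/1 (or bool), one element per DICOM voxel.
    spacing: (z, y, x) voxel dimensions in mm, taken from the DICOM
             SliceThickness and PixelSpacing attributes.
    """
    # Volume of a single voxel is the product of its three dimensions.
    voxel_volume = float(np.prod(spacing))
    # Count the voxels marked as tumor and scale to real units.
    return np.count_nonzero(mask) * voxel_volume
```

For example, a 10x10x10-voxel mask at 1 mm isotropic spacing yields 1000 mm^3. In the real task this would be computed per centroid, restricting the count to the connected component containing that centroid.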
Design doc reference:
Jobs to be done > Segment
The calculation should live in the segment `trained_model.py` file. #XX is the work to add this output to the API endpoint.
NOTE: All PRs must follow the standard PR checklist.
Once we produce summary statistics for tumors, those need to be returned to the frontend through the API.
Currently, the `segment` endpoint only returns a path to a file containing a binary mask of the DICOM image. This is useful for displaying the nodules, but we also want to report summary statistics. This issue tracks taking both the summary statistic calculations and the path to the binary masks generated by `segment.trained_model.predict` and surfacing both pieces of information through the API.
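A possible shape for the extended response, sketched as a plain JSON-serializable dict; the field names (`binary_mask_path`, `volumes`, `volume_mm3`) are hypothetical, not taken from the actual API:

```python
import json

# Hypothetical payload combining the existing mask path with new
# per-centroid summary statistics (one entry per centroid passed
# to predict()).
payload = {
    "binary_mask_path": "/images/full/mask.npy",
    "volumes": [
        {"centroid": {"x": 102, "y": 58, "z": 34}, "volume_mm3": 1240.5},
    ],
}

# The endpoint would return this serialized as JSON.
print(json.dumps(payload, indent=2))
```

Keeping the mask path alongside the statistics lets the frontend render the nodule overlay and the summary table from a single response.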
Design doc reference:
Jobs to be done > Detect and select > Prediction service
Depends on #13, which tracks actually calculating the volumes. It's acceptable to create a stub for that work and implement this independently. You may need to read #13 to fully understand this issue.
NOTE: All PRs must follow the standard PR checklist.
Participants in the Data Science Bowl produced several algorithms that we would like to incorporate. To help facilitate this effort, we also want to add documentation so that contributors can make an educated decision when selecting an algorithm to incorporate.
This documentation should enable people to:
Design doc reference: Detect and select
key | value |
---|---|
team | MDai |
rank | 6 |
repo | https://github.com/mdai/kaggle-lung-cancer |
trained models | requested |
converted branch | |
ML engine | Keras |
engine-version | 1.2.2 |
ML backend | Tensorflow |
backend-version | 1.0.0 |
training method | |
architecture | |
algorithm | |
OS | Ubuntu |
OS version | 16.04 |
Python version | 3.5 |
CUDA version | 8 |
cuDNN version | 5.1 |
notes | https://github.com/pydicom/pydicom/tree/bbaa74e9d02596afc03b924fe8ffbe7b95b6ff55 |
docs folder
NOTE: All PRs must follow the standard PR checklist.