HistoQC

HistoQC is an open-source quality control tool for digital pathology slides

Requirements

Tested with Python 3.7 and 3.8. Note: the Dockerfile installs Python 3.8, so take this into account if your goal is reproducibility.

Requires:

  1. openslide

And the following additional python packages:

  1. python-openslide
  2. matplotlib
  3. numpy
  4. scipy
  5. scikit-image (imported as skimage)
  6. scikit-learn (imported as sklearn)
  7. pytest (optional)

You can likely install the python requirements using something like (note python 3+ requirement):

pip3 install -r requirements.txt

The library versions have been pegged to the current validated ones. Later versions are likely to work but may not allow for cross-site/version reproducibility (typically a bad thing in quality control).
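To check whether an environment still matches the pinned versions, something like the following sketch could be used. It assumes that requirements.txt contains simple `name==version` lines and that Python 3.8+ is available (for `importlib.metadata`). The helper names `parse_pins` and `find_mismatches` are illustrative and not part of HistoQC.

```python
from importlib import metadata


def parse_pins(text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and environment markers, then whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # skip blank lines and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Compare pinned versions with what is installed in this environment."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Calling `find_mismatches(parse_pins(open("requirements.txt").read()))` returns an empty dictionary when every pinned package is installed at exactly the pinned version.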

Openslide binaries have to be installed separately, following the instructions for your operating system.

The most basic docker image can be created with the included (7-line) Dockerfile.

Installation

Using docker

Docker is now the recommended method for installing and running HistoQC. Containerized runtimes like docker are more portable, avoid issues with python environment management, and ensure reproducible application behavior. Docker is available for Windows, macOS, and Linux.

Note: These instructions assume you have docker engine installed on your system. If you do not have docker installed, please see the docker installation instructions.

  1. Begin by pulling the official HistoQC docker image from docker hub. This repository contains the latest stable version of HistoQC and is guaranteed up-to-date.

    docker pull histotools/histoqc:master

  2. Next, run the docker image with a few options to mount your data directory and expose the web interface on your host machine.

    docker run -v <local-path>:/data --name <container-name> -p <local-port>:5000 -it histotools/histoqc:master /bin/bash
    # Example:
    # docker run -v /local/datadir:/data --name my_container -p 5000:5000 -it histotools/histoqc:master /bin/bash

  3. A terminal session will open inside the docker container. You can now run HistoQC as you would on a local machine.

  4. If you exit the shell, the container will stop running but no data/configuration will be lost. You can restart the container and resume your work with the following command:

    docker start -i <container-name>
    # Example:
    # docker start -i my_container

Using pip

You can install HistoQC into your system by using

git clone https://github.com/choosehappy/HistoQC.git
cd HistoQC
python -m pip install --upgrade pip  # (optional) upgrade pip to newest version
pip install -r requirements.txt      # (required) install pinned versions of packages
pip install .                        # (recommended) install HistoQC as a package

Note that pip install . will install HistoQC as a python package in your environment. If you do not want to install HistoQC as a package, you will only be able to run HistoQC from the HistoQC directory.

Basic Usage

histoqc CLI

Running the pipeline is now done via a python module:

C:\Research\code\HistoQC>python -m histoqc --help
usage: __main__.py [-h] [-o OUTDIR] [-p BASEPATH] [-c CONFIG] [-f] [-b BATCH]
                   [-n NPROCESSES] [--symlink TARGET_DIR]
                   input_pattern [input_pattern ...]

positional arguments:
  input_pattern         input filename pattern (try: *.svs or
                        target_path/*.svs ), or tsv file containing list of
                        files to analyze

optional arguments:
  -h, --help            show this help message and exit
  -o OUTDIR, --outdir OUTDIR
                        outputdir, default ./histoqc_output_YYMMDD-hhmmss
  -p BASEPATH, --basepath BASEPATH
                        base path to add to file names, helps when producing
                        data using existing output file as input
  -c CONFIG, --config CONFIG
                        config file to use
  -f, --force           force overwriting of existing files
  -b BATCH, --batch BATCH
                        break results file into subsets of this size
  -s SEED, --seed SEED  set a seed used to produce a random number in all
                        modules
  -n NPROCESSES, --nprocesses NPROCESSES
                        number of processes to launch
  --symlink TARGET_DIR  create symlink to outdir in TARGET_DIR

Whether HistoQC is installed or simply git-cloned, a typical command line for running the tool looks like:

python -m histoqc -c v2.1 -n 3 "*.svs"

which will use 3 processes to operate on all svs files using the named configuration file config_v2.1.ini from the config directory.

In case of errors, HistoQC can be run with the same output directory and will begin where it left off, identifying completed images by the presence of an existing directory.
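The resume behavior described above can be pictured with a minimal sketch. This is not HistoQC's actual implementation: the function name `pending_slides` is hypothetical, and it assumes one result directory per slide, named after the slide file, directly under the output directory.

```python
from pathlib import Path


def pending_slides(slide_paths, outdir):
    """Return the slides that have no result directory under outdir yet."""
    outdir = Path(outdir)
    pending = []
    for slide in slide_paths:
        # Assumption: a completed slide leaves a directory named after the file.
        if not (outdir / Path(slide).name).is_dir():
            pending.append(slide)
    return pending
```

Under these assumptions, rerunning with the same output directory only processes the slides returned by `pending_slides`.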

histoqc.config CLI

Supplied configuration files can be viewed and modified like so:


C:\Research\code\HistoQC>python -m histoqc.config --help
usage: __main__.py [-h] [--list] [--show NAME]

show example config

optional arguments:
  -h, --help   show this help message and exit
  --list       list available configs
  --show NAME  show named example config

Alternatively one can specify their own modified config file using an absolute or relative filename:

python -m histoqc.config --show light > mylight.ini
python -m histoqc -c ./mylight.ini -n 3 "*.svs"

histoqc.ui CLI

HistoQC now includes an HTTP server for improved result viewing. It can be accessed like so:

C:\Research\code\HistoQC>python -m histoqc.ui --help
usage: histoqc.ui [-h] [--port PORT] resultsfilepath

launch server for result viewing in user interface

positional arguments:
  resultsfilepath       Specify the full path to the results file. The user must specify this path.

optional arguments:
  -h, --help            show this help message and exit
  --port PORT, -p PORT  Specify the port [default:5000]

After completion of slide processing, view results in your web-browser by running the following command:

python -m histoqc.ui <results-file-path>
# Example:
# python -m histoqc.ui ./histoqc_output_YYMMDD-hhmmss/results.tsv

Note: The results file is a tab-separated file generated by HistoQC that contains the quality control metrics for each slide. HistoQC writes it to the output directory specified by the -o flag or, by default, to a directory named histoqc_output_YYMMDD-hhmmss.

You may then navigate to http://<hostname>:5000 in your web browser to view the results.

Configuration modifications

HistoQC's performance is significantly improved if you select an appropriate configuration file as a starting point and modify it to suit your specific use case.

If you would like to see a list of provided config files to start you off, you can type

python -m histoqc.config --list

and then you can select one and write it to file like so for your modification and tuning:

python -m histoqc.config --show ihc > myconfig_ihc.ini
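Once a config has been written to a file, it can be edited by hand or programmatically. The sketch below uses Python's standard `configparser` module to change a single value in ini text. The section name `ExampleModule.example` and the key `threshold` used in the usage note are placeholders, not real HistoQC settings; consult the config you exported for the actual names.

```python
import configparser
import io


def set_option(ini_text, section, key, value):
    """Return ini_text with section/key set to value, adding the section if needed."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, key, str(value))
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()
```

For example, `set_option(text, "ExampleModule.example", "threshold", 0.8)` returns the updated ini text. Note that `configparser` does not preserve comments when it rewrites a file, so hand editing may be preferable if the comments in the exported config matter to you.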

Advanced Usage

See wiki

Notes

Information from HistoQC users appears below:

  1. On the new Pannoramic 1000 scanner, objective-magnification is reported as 20 when a 20x objective lens and a 2x aperture boost are used, i.e., the actual image magnification is 40x. The vendor's own CaseViewer somehow detects the boost and reports 40x even though objective-magnification in Slidedat.ini is 20, whereas openslide and bioformats report 20x.

1.1. When a slide is converted to svs by CaseViewer, the MPP entry in the ImageDescription meta-parameter gives the average of the x and y mpp. The two values differ slightly for the new P1000 and can be recovered from the svs meta-parameters tiff.XResolution and tiff.YResolution, which store the inverse values and so must be converted, taking into account whether ResolutionUnit is centimeter or inch.
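The conversion mentioned in note 1.1 is a simple inversion with a unit factor: a resolution in pixels per centimeter or per inch becomes microns per pixel by dividing the unit length in microns by the resolution. A minimal sketch follows; the function name `mpp_from_resolution` is illustrative, and reading the values from the slide is left out.

```python
# Length of one resolution unit in microns.
MICRONS_PER_UNIT = {"centimeter": 10_000.0, "inch": 25_400.0}


def mpp_from_resolution(resolution, unit):
    """Convert a TIFF resolution (pixels per unit) to microns per pixel."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return MICRONS_PER_UNIT[unit] / resolution
```

For example, a resolution of 40000 pixels per centimeter corresponds to 0.25 microns per pixel. Applying the function separately to the X and Y resolutions yields the two slightly different mpp values rather than their average.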

Citation

If you find this software useful, please drop me a line and/or consider citing it:

"HistoQC: An Open-Source Quality Control Tool for Digital Pathology Slides", Janowczyk A., Zuo R., Gilmore H., Feldman M., Madabhushi A., JCO Clinical Cancer Informatics, 2019

Manuscript available here

"Assessment of a computerized quantitative quality control tool for kidney whole slide image biopsies", Chen Y., Zee J., Smith A., Jayapandian C., Hodgin J., Howell D., Palmer M., Thomas D., Cassol C., Farris A., Perkinson K., Madabhushi A., Barisoni L., Janowczyk A., Journal of Pathology, 2020

Manuscript available here

histoqc's People

Contributors

adamjtaylor, ap--, birm, bswhite, cgdogan, choosehappy, cielal, dependabot[bot], foggydae, jacksonjacobs1, jjhbw, kaczmarj, nanli-emory, ntomita, petrovm3, pjl54, satishev, skrackow, tasvora, volodymyrchapman

histoqc's Issues

UI: Move overlay image to side


Author Name: Andrew Janowczyk (@choosehappy)
Original Redmine Issue: 85, http://hawking.case.edu:3000/issues/85
Original Date: 2018-01-03


Comment from Mario:

For figure 5, this placement was a result of my JavaScript skills. I don't think it's so terrible, because either clicking the image again or pressing the "escape" button closes the image so that you can see the graph again, so it's perhaps not as bad as you thought.

Yes, it also didn't bother me too much when you gave the live demo the other day. But wanted to point it out anyway.

UI: add tag functionality


Author Name: Andrew Janowczyk (@choosehappy)
Original Redmine Issue: 115, http://hawking.case.edu:3000/issues/115
Original Date: 2018-01-15
Original Assignee: Ren Zuo


can be comma separated in a separate field (to be provided by back end, likely named "tag")

click "add tag", and any tag added is appended to the already existing list of tags for that image

this comes into play when multi selecting - > add tag - > "too blurry"

then sorting differently, multiselecting- > add tag - > "folded tissue"

any of the images which were in both selected lists should have both tags

bonus functionality: provide list of suggested tags (e.g., when starting to type, show tags already existing) to limit tag "wander"
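The tagging behavior requested above could be sketched as follows, assuming tags are stored as a single comma-separated string per image, as the request suggests. The function names `add_tag` and `suggest_tags` are hypothetical and do not describe any existing HistoQC code.

```python
def add_tag(existing, new_tag):
    """Append new_tag to a comma-separated tag string, skipping duplicates."""
    tags = [t.strip() for t in existing.split(",") if t.strip()]
    new_tag = new_tag.strip()
    if new_tag and new_tag not in tags:
        tags.append(new_tag)
    return ",".join(tags)


def suggest_tags(all_tag_strings, prefix):
    """Return known tags starting with prefix, sorted, to limit tag 'wander'."""
    known = set()
    for tag_string in all_tag_strings:
        known.update(t.strip() for t in tag_string.split(",") if t.strip())
    prefix = prefix.strip().lower()
    return sorted(t for t in known if t.lower().startswith(prefix))
```

An image selected in both multi-selections would pass through `add_tag` twice and end up with both tags, while re-adding an existing tag leaves the string unchanged.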

UI: Filename paths all relative or absolute


Author Name: Andrew Janowczyk (@choosehappy)
Original Redmine Issue: 84, http://hawking.case.edu:3000/issues/84
Original Date: 2018-01-03


Currently in the UI there are a few places where the filenames appear as relative, absolute, or "hidden" path names

should be homogenized, preferably to something which allows the user to rapidly find the files via copy->paste (e.g., absolute)

absolute has the downside that it's a bit tricky if the directory is moved: without special attention, the files won't be found (e.g., can't save absolute filenames in csv file)

Parse out all of Slide Metadata


Author Name: Andrew Janowczyk (@choosehappy)
Original Redmine Issue: 89, http://hawking.case.edu:3000/issues/89
Original Date: 2018-01-03


and present in individual columns

this may become tricky because each manufacturer has different metadata information

also, has the downside of crowding the image, would need to put better "sorting" in the column orders, so that the important values aren't hidden all the way to the right

Break sets up


Author Name: Andrew Janowczyk (@choosehappy)
Original Redmine Issue: 118, http://hawking.case.edu:3000/issues/118
Original Date: 2018-01-17


command line parameter specifying the maximum number of samples per output file

this will help the front end display data and the user review data in coherent chunks
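The splitting requested here amounts to chunking the list of samples into consecutive groups of a maximum size. A minimal sketch is shown below; the name `split_into_batches` is illustrative and is not taken from the HistoQC code base.

```python
def split_into_batches(items, batch_size):
    """Split items into consecutive lists of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```

Each resulting group would then be written to its own output file, with the last group holding any remainder.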
