slideslicer's Introduction

Background

Whole slide images are huge (~3e4 x 5e4 pixels, ~300 MB per file), yet the tissue often occupies less than a quarter of that area, especially in core biopsy slides.
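To put those numbers in perspective, here is a back-of-the-envelope calculation. It assumes 8-bit RGB pixels, which is an assumption on our part; the text above gives only the dimensions and the file size.

```python
# Rough size estimate for one whole slide image (WSI).
# Assumption: 3 channels (RGB) at 1 byte per channel.
width, height = 30_000, 50_000      # ~3e4 x 5e4 pixels
bytes_per_pixel = 3

raw_bytes = width * height * bytes_per_pixel
raw_gb = raw_bytes / 1e9            # uncompressed size in GB

tissue_fraction = 0.25              # "less than a quarter", taken as an upper bound
tissue_pixels = int(width * height * tissue_fraction)

print(f"uncompressed: {raw_gb:.1f} GB")
print(f"pixels worth sampling: at most {tissue_pixels:.2e}")
```

Under these assumptions the decoded slide is several gigabytes, so reading only the tissue-bearing regions, or a lower resolution level, saves most of the work.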

Functionality

This package provides tools to sample and read slides and annotations together at different resolutions and locations.

This package comes with a set of scripts to

  1. sample tissue and specific tissue features and
  2. convert ROI outlines to masks and manipulate the masks.
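As an illustration of the second task, the sketch below rasterizes a polygon outline into a boolean mask with the even-odd rule, using only NumPy. The function name `polygon_to_mask` is hypothetical and this is not necessarily how slideslicer does it internally; the package may rely on a dedicated rasterization library instead.

```python
import numpy as np

def polygon_to_mask(vertices, height, width):
    """Rasterize a closed polygon into a boolean mask (even-odd rule).

    vertices: sequence of (x, y) pairs in pixel coordinates.
    A pixel counts as inside if its center (col + 0.5, row + 0.5) is inside.
    """
    verts = np.asarray(vertices, dtype=float)
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5
    inside = np.zeros((height, width), dtype=bool)
    x0, y0 = verts[-1]
    for x1, y1 in verts:
        if y1 != y0:  # horizontal edges never cross a horizontal ray
            # Rows whose pixel centers lie between the edge's end points.
            crosses = (y0 > py) != (y1 > py)
            # x coordinate where the edge meets each pixel's row.
            x_at_y = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            # Toggle pixels that lie to the left of the crossing.
            inside ^= crosses & (px < x_at_y)
        x0, y0 = x1, y1
    return inside

# A 4 x 3 rectangle inside a 6 x 7 canvas.
mask = polygon_to_mask([(1, 1), (5, 1), (5, 4), (1, 4)], height=6, width=7)
print(mask.astype(int))
```

This evaluates every pixel against every edge, which is fine for small patches but slow for full-resolution slides; it is meant to show the idea rather than to be fast.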

The masks can be stored efficiently in the MS-COCO run-length encoding (RLE) format. This format compresses binary masks enough to store them in JSON files while preserving the original label as free text.
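A minimal sketch of the idea, assuming the uncompressed variant of COCO RLE (run lengths over the column-major flattened mask, starting with a run of zeros). The helper names and the `"label"` key are hypothetical; slideslicer's actual JSON layout may differ, for example by using the compressed string form.

```python
import json
import numpy as np

def rle_encode(mask):
    """Encode a binary mask as uncompressed COCO-style RLE."""
    arr = np.asarray(mask, dtype=np.uint8)
    flat = arr.flatten(order="F")                 # column-major, as in COCO
    # Positions where the value changes mark the run boundaries.
    change = np.flatnonzero(np.diff(flat)) + 1
    bounds = np.concatenate(([0], change, [flat.size]))
    counts = np.diff(bounds).tolist()
    if flat.size and flat[0] == 1:
        counts = [0] + counts                     # counts start with a zero run
    return {"size": list(arr.shape), "counts": counts}

def rle_decode(rle):
    """Invert rle_encode back to a [height x width] uint8 mask."""
    h, w = rle["size"]
    values = np.arange(len(rle["counts"])) % 2    # 0, 1, 0, 1, ...
    flat = np.repeat(values, rle["counts"]).astype(np.uint8)
    return flat.reshape((h, w), order="F")

mask = np.zeros((4, 5), dtype=np.uint8)
mask[1:3, 1:4] = 1
rle = rle_encode(mask)

# Hypothetical record: the "label" key is an assumption for illustration.
record = json.dumps({"label": "inflammation", "segmentation": rle})
restored = rle_decode(json.loads(record)["segmentation"])
```

Because only run lengths are stored, a mask with a few large regions shrinks to a short list of integers regardless of its pixel dimensions.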

These MS-COCO JSON masks can be converted to one-hot [height x width x classes] or sparse [height x width] format. As a rule, we store them in sparse format in PNG files when needed.
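The two layouts carry the same information. The sketch below converts between them with NumPy; the function names are hypothetical and not part of the slideslicer API.

```python
import numpy as np

def sparse_to_onehot(sparse, num_classes):
    """[height x width] integer labels -> [height x width x classes]."""
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.uint8)[sparse]

def onehot_to_sparse(onehot):
    """[height x width x classes] -> [height x width] integer labels."""
    return onehot.argmax(axis=-1).astype(np.uint8)

sparse = np.array([[0, 1, 1],
                   [2, 2, 0]], dtype=np.uint8)
onehot = sparse_to_onehot(sparse, num_classes=3)
print(onehot.shape)
```

The sparse form needs one channel instead of one per class, which is why it fits a single-channel 8-bit PNG as long as there are at most 256 classes.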

Extras

Examples

Setup

Option A: Use a docker image:

docker pull dslituiev/slideslicer:latest # approx 2GB
docker run -it -p 8899:8899 dslituiev/slideslicer:latest # run docker with a jupyter notebook on port 8899

Option B: Native installation under Mac or Ubuntu/Debian:

Step 1. download and install openslide (a C library)

  • OPTION 1 (fast): use a package manager

    • on MacOS with brew

      # install openslide on MacOS
      brew install openslide
      
    • on Debian / Ubuntu

      sudo apt-get install openslide-tools
      
    • other platforms

  • OPTION 2 (slow but robust): build from source

    curl -LOk https://github.com/openslide/openslide/releases/download/v3.4.1/openslide-3.4.1.tar.gz
    tar xzvf openslide-3.4.1.tar.gz
    cd openslide-3.4.1
    ./configure && make && make install
    

Step 2. [optional] create and activate a conda environment

ENV_NAME='slsl'
conda create -y -n $ENV_NAME python=3.6 && source activate $ENV_NAME

Step 3. install the python package

# install dependencies
pip3 install cython
pip3 install numpy
# install slideslicer
pip3 install git+https://github.com/DSLituiev/slideslicer

Input data

Currently slideslicer handles Aperio SVS files with their associated XML annotation files. Please feel free to raise an issue to request support for other formats, or to offer a pull request.

The input data comes as

  1. a whole slide image (WSI)
  2. an ROI outlines file (in XML format; currently Leica SVS-style XML only)
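For orientation, the sketch below reads ROI outlines from such an XML file with the Python standard library. It assumes the usual Aperio ImageScope layout (`Region` elements holding `Vertex` elements with `X` and `Y` attributes) and that the label sits in the region's `Text` attribute; both are assumptions, and the example XML is hand-written for illustration. `read_rois` is a hypothetical helper, not a slideslicer function.

```python
import xml.etree.ElementTree as ET

# Hand-written example in the assumed Aperio ImageScope layout.
EXAMPLE_XML = """
<Annotations>
  <Annotation Id="1">
    <Regions>
      <Region Id="1" Text="inflammation">
        <Vertices>
          <Vertex X="10" Y="20"/>
          <Vertex X="30" Y="20"/>
          <Vertex X="30" Y="40"/>
        </Vertices>
      </Region>
    </Regions>
  </Annotation>
</Annotations>
"""

def read_rois(xml_text):
    """Return one dict per region: id, free-text name, and (x, y) vertices."""
    root = ET.fromstring(xml_text.strip())
    rois = []
    for region in root.iter("Region"):
        vertices = [(float(v.get("X")), float(v.get("Y")))
                    for v in region.iter("Vertex")]
        rois.append({"id": region.get("Id"),
                     "name": region.get("Text", ""),
                     "vertices": vertices})
    return rois

rois = read_rois(EXAMPLE_XML)
print(rois[0]["name"], len(rois[0]["vertices"]))
```

The vertex coordinates are in full-resolution pixel units in this layout, so they need to be scaled when a slide is read at a lower resolution.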

Pipeline scripts

Use the following command-line tools to slice multiple slides:

# download SVS file from Google Cloud Storage and sample patches from it
pull_n_chop.sh

# subsample if needed; keep only one of the FACTOR lines below (the last assignment wins)
FACTOR=2 # produces 512x512
FACTOR=4 # produces 256x256 

DATADIR="/repos/data/data_1024/fullsplit/all"
subsample.py $DATADIR $FACTOR

# link inflammation vs everything else classes
# BASEDIR="/repos/data/data_1024/"
BASEDIR="/repos/data/data_128_subsample_8x/"
./link_binary_infl_norm.sh $BASEDIR

# split into train and test set
makesets.sh

# create sparse png masks from COCO JSON files
json_to_png_csv.py
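The subsampling step above can be pictured as follows. This is a sketch that assumes 1024x1024 input patches (inferred from the `data_1024` directory name and the sizes in the comments) and uses block averaging; the resampling method that `subsample.py` actually applies is not documented here, and the `subsample` function below is hypothetical.

```python
import numpy as np

def subsample(patch, factor):
    """Downsample an [H x W x C] patch by averaging factor x factor blocks."""
    h, w, c = patch.shape
    if h % factor or w % factor:
        raise ValueError("patch size must be divisible by factor")
    # Split each spatial axis into (blocks, within-block) and average the latter.
    blocks = patch.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3)).astype(patch.dtype)

patch = np.zeros((1024, 1024, 3), dtype=np.uint8)
print(subsample(patch, 2).shape)   # (512, 512, 3)
print(subsample(patch, 4).shape)   # (256, 256, 3)
```

Averaging suits image patches; for label masks, plain striding such as `mask[::factor, ::factor]` avoids blending class indices into meaningless intermediate values.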
