Giter VIP home page Giter VIP logo

odsc_workshop's Introduction

Introduction to Clinical Natural Language Processing: Predicting Hospital Readmission with Discharge Summaries

Clinical notes from physicians and nurses contain a vast wealth of knowledge and insight that can be utilized for predictive models to improve patient care and hospital workflow. In this workshop, we will introduce a few Natural Language Processing techniques for building a machine learning model in Python with clinical notes.

As an example, we will focus on predicting unplanned hospital readmission with discharge summaries using the MIMIC III data set. After completing this tutorial, the audience will know how to prepare data for a machine learning project, preprocess unstructured notes using a bag-of-words approach, build a simple predictive model, assess the quality of the model and strategize how to improve the model.

Getting Started

Getting Access to MIMIC III

The MIMIC III data set requires requesting access in advance, so please request access as early as possible. You can follow step-by-step instructions for requesting access here (https://towardsdatascience.com/getting-access-to-mimic-iii-hospital-database-for-data-science-projects-791813feb735) or follow the instructions on the MIMIC III website here (https://mimic.physionet.org/gettingstarted/access/). Once you request access, it usually takes a few days to get approved so please do this as early as possible.

Once you have access, please visit https://physionet.org/works/MIMICIIIClinicalDatabase/files/ and download

  • ADMISSIONS.csv.gz
  • NOTEEVENTS.csv.gz

and place these files in a folder labeled 'data' in this repo.

Setting up local Python Environment

For this workshop, you will need the following Python packages:

  1. Python 3
  2. skikit-learn
  3. pandas
  4. numpy
  5. jupyter
  6. ipykernel
  7. nltk

Easiest way: Anaconda Distribution of Python

For those with the Anaconda Distribution, I have created a yml environment file (environment.yml) that can be used to create a virtual environment with the required information.

  1. $ conda env create -f environment.yml
  2. $ activate odsc_workshop
  3. $ python -m ipykernel install --user --name odsc_workshop

Run Jupyter Notebook

$ jupyter notebook

The main jupyter notebook we will use in this tutorial is odsc_mimic.ipynb. Due to time constraints, we will skip some of the preprocessing and cleaning of the MIMIC files. If you are interested in a notebook that steps through these steps see odsc_mimic_pre.ipynb. Note that these same steps were then added to odsc_utils.py.

odsc_workshop's People

Contributors

andrewwlong avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.