CS598_Project

Replicating "Design and implementation of a deep recurrent model for prediction of readmission in urgent care using electronic health records" (Zebin and Chaussalet, 2019)

Model Training and Evaluation

Note: To run this project, you need a copy of the MIMIC-III dataset on the machine where you will run it. MIMIC-III is available through credentialed access on PhysioNet (physionet.org).

0. Installing Dependencies

This project requires:

Dependency	Version
python	3.11 or higher
matplotlib	3.7.1
numpy	1.24 or higher
p_tqdm	1.4.0
pandas	2.0 or higher
scikit-learn	1.2.2
torch	2.0.0
torchvision	0.15.1
openpyxl	3.1.2

Additionally, you will need to install Jupyter to run the interactive notebooks.
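
Before moving on, it can help to confirm that the installed packages match the table above. The following is a minimal sketch, not part of the repository: it treats every listed version as a minimum (an assumption, since some rows are pinned exactly), and the helper names parse and check are hypothetical.

```python
import re
import sys
from importlib import metadata

# Versions copied from the dependency table; treated here as minimums (an assumption).
# torchvision is left out: install the release that matches your torch version.
REQUIRED = {
    "matplotlib": "3.7.1",
    "numpy": "1.24",
    "p_tqdm": "1.4.0",
    "pandas": "2.0",
    "scikit-learn": "1.2.2",
    "torch": "2.0.0",
    "openpyxl": "3.1.2",
}


def parse(version):
    """Turn '2.0.0+cu118' into (2, 0, 0); stop at the first non-numeric piece."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check(required, installed_version=metadata.version):
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = []
    for name, minimum in required.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if parse(found) < parse(minimum):
            problems.append(f"{name}: found {found}, need {minimum}")
    return problems


if __name__ == "__main__":
    if sys.version_info < (3, 11):
        print("python: 3.11 or higher is required")
    for line in check(REQUIRED) or ["all listed dependencies satisfied"]:
        print(line)
```

The installed_version parameter exists only so the check can be exercised without touching the real environment.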

1. Subsampling the Data

If you already have a suitably sized subset of the MIMIC-III dataset, you can skip this step.

The data_sampling/datasampler.py script generates a subset of the full MIMIC-III dataset that is small enough to process on a single machine. We used a sample size of 10,000 ICU stays.

Note: Before running the script, edit it so that it points to the path of your local copy of the MIMIC-III dataset.

Running the data sampler:

python3 data_sampling/datasampler.py
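
The idea behind subsampling by ICU stay can be sketched as follows. This is an illustration only and may differ from what datasampler.py actually does: the function names, the seed, and the tiny in-memory table are hypothetical, and it uses only the standard library. ICUSTAY_ID is the stay identifier column in the MIMIC-III ICUSTAYS table.

```python
import csv
import io
import random


def sample_stay_ids(stay_ids, sample_size, seed=0):
    """Return a reproducible random subset of the unique ICU stay IDs."""
    unique_ids = sorted(set(stay_ids))
    if sample_size >= len(unique_ids):
        return unique_ids
    rng = random.Random(seed)  # fixed seed so the same subset is drawn every run
    return sorted(rng.sample(unique_ids, sample_size))


def filter_rows(rows, keep_ids, id_column="ICUSTAY_ID"):
    """Keep only the rows that belong to one of the sampled stays."""
    keep = set(keep_ids)
    return [row for row in rows if row[id_column] in keep]


if __name__ == "__main__":
    # Stand-in for a real table; a real run would open the CSV from your MIMIC-III copy.
    demo_csv = io.StringIO("ICUSTAY_ID,SUBJECT_ID\n1,10\n2,11\n3,12\n4,13\n")
    rows = list(csv.DictReader(demo_csv))
    chosen = sample_stay_ids((row["ICUSTAY_ID"] for row in rows), sample_size=2, seed=42)
    subset = filter_rows(rows, chosen)
    print(chosen, len(subset))
```

Sampling the stay IDs first and then filtering every table by those IDs keeps all records of a sampled stay together, which matters when events for one stay are spread across several files.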

2. Preprocessing the Data

The data must be preprocessed and stored in binary files before the models can use it. Run the following scripts, located in the data_cleaning directory, before launching the interactive notebooks:

python3 data_cleaning/preprocessing_mp.py
python3 data_cleaning/events_to_list.py
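
As a rough picture of what "events to list" and "stored in binary files" can mean, the sketch below groups event records into one time-ordered list per ICU stay and serializes the result with pickle. This is an assumption about the general shape of such a step, not a description of the repository's scripts; the record layout (stay_id, charttime, value) and the function name are hypothetical.

```python
import pickle
from collections import defaultdict


def events_to_lists(events):
    """Group (stay_id, charttime, value) records into a time-ordered value list per stay."""
    grouped = defaultdict(list)
    for stay_id, charttime, value in events:
        grouped[stay_id].append((charttime, value))
    return {
        stay_id: [value for _, value in sorted(items, key=lambda item: item[0])]
        for stay_id, items in grouped.items()
    }


if __name__ == "__main__":
    # Placeholder records, not real patient data.
    demo_events = [(1, 5, "b"), (2, 1, "x"), (1, 2, "a")]
    sequences = events_to_lists(demo_events)

    # pickle.dumps produces the bytes that would be written to a binary file.
    blob = pickle.dumps(sequences)
    restored = pickle.loads(blob)
    print(restored == sequences)
```

Variable-length lists per stay are a natural input for recurrent models, which is why a per-stay grouping is shown here.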

3. Running the Interactive Python Notebooks

The models are trained in Jupyter interactive notebooks. To train a model, launch the Jupyter server with the notebook file of the model you wish to train:

# Open LSTM+CNN Model interactive notebook
jupyter notebook source/lstm-cnn-model.ipynb

# Open LSTM Model baseline interactive notebook
jupyter notebook source/lstm-model.ipynb

# Open Logistic Model baseline interactive notebook
jupyter notebook source/logistic-model.ipynb

The results history of each model is stored as a pickle binary file in the same directory as the notebooks.
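
A stored results history can be read back with the standard pickle module. The sketch below assumes the history is a dictionary that maps metric names to per-epoch lists; that layout, the file name, and the metric names are hypothetical, so inspect the object you actually load before relying on it. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def load_history(path):
    """Unpickle one results-history file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def final_values(history):
    """For a dict of per-epoch lists, return the last recorded value of each metric."""
    return {name: values[-1] for name, values in history.items() if values}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = Path(tmp) / "demo_history.pkl"  # hypothetical file name
        # Placeholder numbers for illustration, not results from this project.
        demo_path.write_bytes(pickle.dumps({"train_loss": [0.9, 0.6], "val_loss": [1.0, 0.7]}))
        print(final_values(load_history(demo_path)))
```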

Repository Organization

This project repository is organized into the following sub-folders:

data: binary working folders of preprocessed data (MIMIC-III data must be added according to instructions above)
demo_data: binary working folders of preprocessed data from publicly available demo data used for model development
data_sampling: data subset sampling source code
data_cleaning: data preprocessing source code
project_summary: tables and figures summarizing reproducibility results
source: interactive notebooks for each model and utility classes

Citation

T. Zebin and T. J. Chaussalet, "Design and implementation of a deep recurrent model for prediction of readmission in urgent care using electronic health records," 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Siena, Italy, 2019, pp. 1-5, doi: 10.1109/CIBCB.2019.8791466.
