Value of Commercial Product Sales Data in Healthcare Prediction

NHSX Analytics Unit - PhD Internship Project

About the Project

This repository holds code for the NHSX Analytics Unit PhD internship project by Elizabeth Dolan, which investigates the use of model class reliance (MCR) to identify the value of including commercial sales data in respiratory death predictions.

Project Description - Value of Commercial Product Sales Data in Healthcare Prediction

Note: No data, public or private, are shared in this repository.

Project Structure

  • The main code is found in the root of the repository (see Usage below for more information)
  • The accompanying report is also available in the reports folder. Results and Discussion can be read as a full pre-print via https://www.researchsquare.com/article/rs-2226531/v1.
  • The required Python libraries are listed in requirements.txt. Note that the MCR (Model Class Reliance) package must be installed separately from https://github.com/gavin-s-smith/mcrforest. You may need to install numpy and Cython before mcrforest will install. You also need scikit-learn version 0.24.2 so that "from sklearn.model_selection import TimeSeriesSplit" imports a TimeSeriesSplit with the parameters needed to prevent data leakage in the time series cross-validation.
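To illustrate what "no data leakage" means for time series cross-validation, here is a minimal pure-Python sketch of an expanding-window split with a gap between training and test data. It is not the project's code and not scikit-learn's implementation. The function name and the example sizes are hypothetical. scikit-learn 0.24 added the `test_size` and `gap` parameters to TimeSeriesSplit, which control a similar scheme, but which parameters the project's scripts actually use is not stated here.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_samples: int, n_splits: int, test_size: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for time-ordered data.

    Each training window starts at index 0 and ends `gap` samples before
    its test window, so no row from the test period (or the gap) is used
    for fitting.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start - gap <= 0:
        raise ValueError("Not enough samples for the requested splits and gap.")
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(0, test_start - gap))
        test = list(range(test_start, test_start + test_size))
        yield train, test


if __name__ == "__main__":
    # Hypothetical example: 12 weekly observations, 3 folds, 2-week test
    # windows, and a 1-week gap between training and test data.
    for train, test in expanding_window_splits(12, n_splits=3, test_size=2, gap=1):
        print("train:", train[0], "to", train[-1], "| test:", test)
```

Every training index is earlier than every test index in the same fold, which is the property that ordinary shuffled k-fold cross-validation does not guarantee.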

Built With

Python v3.8

Getting Started

Installation

To get a local copy up and running, follow these steps.

To clone the repo:

git clone https://github.com/nhsx/commercial-data-healthcare-predictions.git

To create a suitable environment:

  • python -m venv .venv or virtualenv -p /path/to/required/python/version .venv
  • source .venv/bin/activate
  • (if needed) pip install numpy && pip install Cython
  • pip install git+https://github.com/gavin-s-smith/mcrforest
  • pip install -r requirements.txt

You may need to install psycopg2 (https://www.psycopg.org/docs/install.html), which in turn can require gcc and additions to your PATH (https://stackoverflow.com/questions/5420789/how-to-install-psycopg2-with-pip-on-python).

Caveat for Apple MacBook Pro M1 users: scikit-learn will not install using the usual methods; the installation fails with a build dependencies error. Install it with: pip3 install -U --no-use-pep517 scikit-learn==0.24.2 (for more information, see scikit-learn/scikit-learn#19137).
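Because the code depends on a specific scikit-learn version, it can help to confirm what is installed in the active environment before running anything. The following is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper name `pinned_version_status` is hypothetical and not part of this repository.

```python
from importlib import metadata


def pinned_version_status(package: str, required: str) -> str:
    """Report whether `package` is installed at exactly the `required` version."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if installed == required else f"mismatch (found {installed})"


if __name__ == "__main__":
    # 0.24.2 is the scikit-learn version this README asks for.
    print("scikit-learn:", pinned_version_status("scikit-learn", "0.24.2"))
```

Run it inside the activated virtual environment so that it inspects the same packages the project scripts will import.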

Usage

Note: In its current form, this repository is shared with fake data so that the code can run. This data is randomly sampled from the same metadata features as the real data but bears no resemblance to the ground truth data.

Run Create_op_rf_for_mcr.py to create a set of models that predict registered deaths from respiratory disease. These models use commercial sales data and a wide range of other variables that have shown associations with deaths from respiratory disease.

Run MCR_for_op_rf.py to create explanations for the models by identifying the impact that each input variable, including commercial sales data, has on the models' predictions. This code implements the novel variable importance tool MCR for the random forest regressor.

Dataset

Experiments are run against the:

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

See CONTRIBUTING.md for detailed guidance.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

To find out more about the Analytics Unit visit our project website or get in touch at [email protected].

Acknowledgements
