
synfo-ny-discharge's Introduction

NY SPARCS Hospital Discharge - Synthetic Data Generation

Welcome to the repository for synthetic data generation based on New York's SPARCS Hospital Discharge Records. Our goal here is to showcase how synthetic data can enhance the capabilities of machine learning models.

📌 Motivation

Synthetic data can be a valuable tool in machine learning. By generating high-quality synthetic data, we can train large-scale ML systems that, in some cases, perform as well as or better on real-world data than models trained and tested solely on real data. This approach can reduce overfitting, diversify the training data, and improve robustness, making models more adaptable to a variety of scenarios.

📊 Progress

Item                                Status
Synthetic Data Generation Models    ✔️ Completed
Data Specification Paper            ⏳ In Progress
Risk Prediction Paper               ⏳ In Progress

📘 Contents

📔 Notebooks

This Jupyter notebook walks you through the initial steps in our data preprocessing pipeline. Here, we detail the process of cleaning the SPARCS Hospital Discharge Records, ensuring data quality and consistency.
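
As a rough illustration of what such a cleaning step might look like, here is a minimal sketch using only the Python standard library. It is not taken from the notebook: the column names `Length of Stay` and `Total Charges`, the top-coded `120 +` format, and the `$1,234.50` charge format are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical column names, chosen for illustration only.
LOS = "Length of Stay"
CHARGES = "Total Charges"


def clean_record(row: dict) -> Optional[dict]:
    """Return a cleaned copy of one record, or None if it is unusable."""
    # Trim stray whitespace from every key and every string value.
    cleaned = {k.strip(): v.strip() if isinstance(v, str) else v
               for k, v in row.items()}
    # Drop records that have any empty field.
    if any(v in ("", None) for v in cleaned.values()):
        return None
    try:
        # Assumed: stays may be top-coded like "120 +"; keep the numeric part.
        cleaned[LOS] = int(str(cleaned[LOS]).replace("+", "").strip())
        # Assumed: charges may carry "$" and thousands separators.
        cleaned[CHARGES] = float(
            str(cleaned[CHARGES]).replace("$", "").replace(",", ""))
    except (KeyError, ValueError):
        # Missing column or non-numeric value: treat the record as unusable.
        return None
    return cleaned


def clean_records(rows):
    """Clean an iterable of records, discarding the unusable ones."""
    return [c for c in map(clean_record, rows) if c is not None]
```

Dropping incomplete records is only one possible policy; imputing missing values instead may be preferable when too many rows would otherwise be lost.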

In this notebook, we delve into encoding techniques tailored for the dataset. Encoding is crucial in preparing the data for synthetic data generation, ensuring that the generated data maintains properties similar to the original.
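
One simple encoding that keeps the original categories recoverable is a reversible integer (ordinal) mapping. The sketch below is an assumption about what such a step could look like, not the notebook's actual technique; the function names are hypothetical.

```python
def fit_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {cat: code for code, cat in enumerate(sorted(set(values)))}


def encode(values, mapping):
    """Replace categories with their integer codes."""
    return [mapping[v] for v in values]


def decode(codes, mapping):
    """Invert the mapping to recover the original categories."""
    inverse = {code: cat for cat, code in mapping.items()}
    return [inverse[c] for c in codes]
```

Because the mapping is invertible, synthetic rows produced in the encoded space can be decoded back into the same category labels used by the source data.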

This notebook is a step-by-step guide to synthetic data generation for the NY SPARCS data, suitable for both beginners and experienced practitioners.
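
The generation model itself is not described in this README. Purely to make the idea concrete, the sketch below uses a deliberately simple stand-in: each column is sampled independently from its observed values. This is not the notebook's method, and the function name `sample_synthetic` is hypothetical.

```python
import random


def sample_synthetic(rows, n, seed=0):
    """Draw n synthetic rows by sampling each column independently."""
    rng = random.Random(seed)  # seeded for reproducibility
    columns = list(rows[0].keys())
    # Keep duplicates so that sampling follows the observed frequencies.
    observed = {col: [r[col] for r in rows] for col in columns}
    return [{col: rng.choice(observed[col]) for col in columns}
            for _ in range(n)]
```

Independent sampling preserves each column's marginal distribution but discards correlations between columns, which is the main reason real generators use richer models.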

📂 Data Files

New York's SPARCS Hospital Discharge Record data from 2009 to 2021 can be obtained from the Health Data NY Catalog.

This text file contains ICD-9 to ICD-10 mappings, an essential tool for understanding the conversion between these medical coding systems.

Updated geographical data linked with the NY SPARCS records. This dataset provides details on Operating Certificate Numbers, assisting in the geographic location mapping of hospitals and care facilities.

📜 License

This project is licensed, ensuring open source availability while protecting against unauthorized use or modification. Please take a look at the LICENSE file for a detailed description of terms and conditions.

👥 Contribution

Contributions are what make the open-source community such an inspiring place for discovery, learning, and innovation. We welcome your contributions! Whether it's fixing bugs, enhancing documentation, proposing new features, or helping out in any other way, every contribution counts.


New York Department of Health Data Use Policy

We thank the New York Department of Health for releasing these datasets. Note: The New York State Department of Health makes no representation, warranty or guarantee relating to the data or analyses derived from these data. For more information on the New York Department of Health Data Use Policy, please see their statement.

synfo-ny-discharge's People

Contributors

mattfaltyn
