CoverCHILD data integration FHIR ETL

last updated: 2023-11-30

Purpose

This repository contains code for the CoverCHILD data integration project. Patient data is queried from a FHIR server on-site and transformed into flat tables corresponding to FHIR resources for further anonymisation, processing, and analysis (e.g., the monitoring dashboard use case).

Dependencies

This script uses R (≥ 4.1.0) and the following R packages, together with their respective dependencies:

Missing R packages are installed automatically from the R package repository (CRAN). If that is not wanted or not possible, install the packages manually before running the script. Note that the script installs missing packages but does not yet check whether the versions of already installed packages match the required ones.
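Because package versions are not checked automatically, it can help to at least confirm the R version before a run. The following is a minimal sketch, not part of the repository: the helper name 'version_ge' is hypothetical, and it assumes a 'sort' that supports version sorting ('-V', as in GNU coreutils).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Succeeds if version $1 is greater than or equal to version $2.
# Relies on 'sort -V' (version sort, available in GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="4.1.0"

# Only query R if Rscript is actually on the PATH.
if command -v Rscript >/dev/null 2>&1; then
  installed="$(Rscript -e 'cat(as.character(getRversion()))')"
  if version_ge "$installed" "$required"; then
    echo "R $installed satisfies the requirement (>= $required)."
  else
    echo "R $installed is older than $required; please update R." >&2
  fi
else
  echo "Rscript not found on PATH; install R >= $required first." >&2
fi
```

The same comparison could be applied to individual package versions, but the required package versions are not stated here, so that part is left out.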

Folder structure

├── code/
│   ├── fhir_etl.R                   # main R script
│   └── functions.R                  # R helper functions
├── config/
│   ├── EXAMPLE_fhir_cfg.yml         # general configuration TEMPLATE
│   ├── EXAMPLE_fhir_search_cfg.yml  # FHIR search configuration TEMPLATE
│   ├── fhir_cfg.yml                 # general configuration, generated by running 
│   │                                # 'create_fresh_config.sh' or by copying and renaming 
│   │                                # 'EXAMPLE_fhir_cfg.yml' manually
│   └── fhir_search_cfg.yml          # FHIR search configuration, generated by running 
│                                    # 'create_fresh_config.sh' or by copying & renaming 
│                                    # 'EXAMPLE_fhir_search_cfg.yml' manually
├── logs/                            # log files (timings, http errors)
├── output/                          # final output of the script
├── tmp/                             # temporary files
│
├── CoverCHILD_FHIR_ETL.Rproj        # RStudio project file for running the script interactively
├── create_fresh_config.sh           # creates/resets configuration files by copying from TEMPLATES
└── run_fhir_etl.sh                  # runs the script (fhir_etl.R) while logging output

Steps for running the script

1) Configuration

  • Run 'create_fresh_config.sh' to create the two necessary configuration files from the templates in the 'config/' directory, or copy the templates manually and rename them to 'fhir_cfg.yml' and 'fhir_search_cfg.yml', as shown in the folder structure.
  • Configure 'config/fhir_cfg.yml': server settings and general behaviour of the script.
  • Configure 'config/fhir_search_cfg.yml': FHIR search parameters and resource element selection. This file only needs to be modified in special cases:
    • Make sure that all elements of the filter statements are present on the FHIR server, and comment out filter elements that are not supported, for example if patients' addresses are censored.
    • If the FHIR server supports a custom 'ServiceType' SearchParameter for Encounter resources, uncomment the respective statement to enable leaner queries.

For further information and instructions, see the documentation within the configuration files.
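The manual copy-and-rename alternative from step 1 can be sketched as a small shell function. This is an illustration only and not the contents of 'create_fresh_config.sh'; the function name 'create_configs' is hypothetical, and the file names are taken from the folder structure above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: copy each EXAMPLE_ template in a config directory to
# its working name. This is NOT the repository's create_fresh_config.sh.
create_configs() {
  local config_dir="$1"
  local name template
  for name in fhir_cfg.yml fhir_search_cfg.yml; do
    template="$config_dir/EXAMPLE_$name"
    if [ ! -f "$template" ]; then
      echo "Template not found: $template" >&2
      return 1
    fi
    # Overwrites an existing configuration, i.e. resets it to the template.
    cp "$template" "$config_dir/$name"
    echo "Created $config_dir/$name"
  done
}

# Usage (from the repository root):
#   create_configs config
```

Note that this sketch overwrites existing configuration files without asking, so any local edits should be backed up first.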

2) Executing the script

  • in an interactive R session, by opening the 'CoverCHILD_FHIR_ETL.Rproj' R project and running the 'code/fhir_etl.R' script

or

  • by running 'run_fhir_etl.sh'. In this case, all output is logged to the folder specified for log files in 'config/fhir_cfg.yml'.
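To illustrate what running a script while logging its output can look like, here is a minimal sketch. It is not the contents of 'run_fhir_etl.sh': the function name 'run_logged' and the log file naming are hypothetical, and the log folder is passed as an argument instead of being read from 'config/fhir_cfg.yml'.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run a command and write its stdout and stderr to a
# time-stamped log file. This is NOT the repository's run_fhir_etl.sh.
run_logged() {
  local log_dir="$1"
  shift
  mkdir -p "$log_dir" || return 1
  local log_file
  log_file="$log_dir/run_$(date +%Y%m%d_%H%M%S).log"
  # Append both output streams of the command to the log file.
  "$@" >>"$log_file" 2>&1
  local status=$?
  echo "Exit status $status; output logged to $log_file"
  # Pass the command's exit status on to the caller.
  return "$status"
}

# Usage (from the repository root, assuming 'logs' is the configured folder):
#   run_logged logs Rscript code/fhir_etl.R
```

Returning the command's exit status lets a scheduler or calling script detect a failed run without parsing the log.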

Default filter criteria

By default, the script queries all Patient, Condition, Procedure, and Observation resources belonging to Encounters that meet the following criteria:

  • the admission date is between 2016-01-01 and 2022-03-31
  • the Encounter involves contact with the pediatrics or child and adolescent psychiatry departments
  • the patient is under 18 years of age at admission
  • the patient has a German address

Filter criteria can be inspected and modified in 'config/fhir_search_cfg.yml'.
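As a rough illustration of how a date criterion translates into a FHIR search, the sketch below builds an Encounter search URL with the standard 'ge' (greater or equal) and 'le' (less or equal) date prefixes. The base URL and the function name are hypothetical, and the use of the 'date' search parameter is an assumption; the parameters actually used by the script are the ones defined in 'config/fhir_search_cfg.yml'.

```shell
#!/usr/bin/env bash
# Hypothetical server base URL; replace with the on-site FHIR endpoint.
base_url="https://fhir.example.org/fhir"

# Build an Encounter search URL restricted to a date range.
# 'ge' and 'le' are standard FHIR search prefixes (>= and <=).
# Assumption: the 'date' search parameter is used; the script's real
# parameters are defined in config/fhir_search_cfg.yml.
encounter_search_url() {
  printf '%s/Encounter?date=ge%s&date=le%s\n' "$1" "$2" "$3"
}

encounter_search_url "$base_url" 2016-01-01 2022-03-31
# prints: https://fhir.example.org/fhir/Encounter?date=ge2016-01-01&date=le2022-03-31
```

The 'date' parameter matches against the whole Encounter period, so this is only an approximation of a filter on the admission date.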

Check for success / Troubleshooting

If the script ran successfully:

  • the last entry of the corresponding 'FHIR_timings_*.csv' in the log directory is 'Run FHIR ETL.'
  • the output directory contains one file per resource, i.e., Patient, Encounter, Condition, Procedure, and Observation (unless 'save_output' was set to null in 'config/fhir_cfg.yml')
  • no http error file was generated in the log directory
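The first check can be automated with a few lines of shell. This is a minimal sketch under stated assumptions: the function name 'check_etl_run' is hypothetical, and it assumes that the last line of the newest 'FHIR_timings_*.csv' contains the text 'Run FHIR ETL.' literally. The names of the output and http error files are not specified here, so those two checks are left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Succeeds if the newest FHIR_timings_*.csv in the given log directory
# ends with an entry containing 'Run FHIR ETL.'.
check_etl_run() {
  local log_dir="$1"
  local latest
  # Newest timings file by modification time; empty if none exists.
  latest="$(ls -t "$log_dir"/FHIR_timings_*.csv 2>/dev/null | head -n 1)"
  if [ -z "$latest" ]; then
    echo "No FHIR_timings_*.csv found in $log_dir" >&2
    return 1
  fi
  # Assumption: the final step is recorded literally on the last line.
  if tail -n 1 "$latest" | grep -qF 'Run FHIR ETL.'; then
    echo "OK: $latest ends with 'Run FHIR ETL.'"
  else
    echo "Incomplete run: last entry of $latest is not 'Run FHIR ETL.'" >&2
    return 1
  fi
}

# Usage (from the repository root, assuming 'logs' is the log directory):
#   check_etl_run logs
```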

FAQ

  • This section will be filled in during the test phase.
  • Error 'could not find function "..."': an installed package version is probably outdated.

Contact

We're very interested in your experience running the script and would be happy to receive any feedback, including comments, troubleshooting reports, questions, and suggestions for improvement.
Feedback on performance is especially useful to us, i.e.:

  • the generated 'FHIR_timings_*.csv' log files, which do not contain sensitive data
  • optimal/feasible batch size configuration on the used hardware

Please feel free to message us on GitHub, open issues, or send an email to [email protected].


Published under CC-BY-4.0.

Contributors

simeonplatte, achiocch

coverchild_fhir_etl's Issues

Cannot join PatientID with subject reference

Hello,

we are currently trying to execute the pipeline for the CoverChild project. While executing we get the following error (see screenshot):

[Screenshot: CoverChildError1]

Do you have an idea what could be causing the error?

Thank you for your help!

Best,

Peter, AIIM @ TUM
