matchminer-curate's Introduction

OncoKB Core

Repository for OncoKB, a precision oncology knowledge base.

The core of the OncoKB annotation service.

Info

Running Environment

Confirm that your running environment provides:

  • Java version: 8
  • MySQL version: 5.7.28
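The versions above can be checked from a shell before building. The sketch below is one way to do it; the helper names `version_matches` and `check_tool` are hypothetical and not part of OncoKB. It assumes that Java 8 reports itself as `version "1.8.…"` and that the MySQL client prints `Distrib 5.7.28`. Note that `mysql --version` reports the client version; the server version may differ.

```shell
# Hypothetical helper: succeeds when the text in $1 contains the string $2.
version_matches() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Report whether a tool's version output mentions the expected version.
# $1 = label, $2 = expected version string, remaining args = command to run.
check_tool() {
    local label="$1" expected="$2" output
    shift 2
    # java prints its version on stderr, so capture both streams.
    if ! output=$("$@" 2>&1); then
        echo "$label: could not run '$*'"
        return 1
    fi
    if version_matches "$output" "$expected"; then
        echo "$label: OK ($expected)"
    else
        echo "$label: expected $expected"
        return 1
    fi
}
```

For example, `check_tool Java 'version "1.8.' java -version` and `check_tool MySQL 'Distrib 5.7.28' mysql --version` each print one status line.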

Prepare properties files

cp -r core/src/main/resources/properties-EXAMPLE core/src/main/resources/properties

Properties file

  1. database.properties
    • jdbc.driverClassName: OncoKB uses MySQL as its database, so set this to com.mysql.jdbc.Driver.
    • jdbc.url: the database URL
    • jdbc.username & jdbc.password: the MySQL user name and password
  2. config.properties
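As an illustration of the database.properties keys listed above, the following sketch writes a file with those four keys. The helper name `write_db_properties` is hypothetical, and the host, port, user, and password in the usage example are placeholder assumptions; substitute the values for your own MySQL server.

```shell
# Hypothetical helper: write a database.properties file with the four JDBC keys.
# $1 = output path, $2 = JDBC URL, $3 = MySQL user, $4 = MySQL password
write_db_properties() {
    local out="$1" url="$2" user="$3" password="$4"
    cat > "$out" <<EOF
jdbc.driverClassName=com.mysql.jdbc.Driver
jdbc.url=$url
jdbc.username=$user
jdbc.password=$password
EOF
}
```

A call such as `write_db_properties core/src/main/resources/properties/database.properties "jdbc:mysql://localhost:3306/oncokb" "oncokb_user" "change-me"` would overwrite the copied example file, so review the result before building.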

Build the WAR file

mvn clean install -P public -DskipTests=true

The WAR file is generated under web/target/ in the repository.

Deploy with frontend

Choose one of the following profiles when building the WAR file:

  • curate - core + API + curation website
  • public - core + API + public website (deprecated)

You can find specific instructions in the curate or public repo.

Run with Docker containers

OncoKB™ is a precision oncology knowledge base developed at Memorial Sloan Kettering Cancer Center that contains biological and clinical information about genomic alterations in cancer. OncoKB uses Genome Nexus to convert genomic changes to protein changes using OncoKB-selected transcripts. By default, API requests are sent to www.genomenexus.org for GRCh37 and grch38.genomenexus.org for GRCh38. To use a local version of Genome Nexus instead, follow the instructions for Option A; otherwise, follow the instructions for Option B.

The OncoKB Docker Compose file consists of the following services:

  • OncoKB: provides variant annotations

  • OncoKB Transcript: serves OncoKB metadata including gene, transcript, sequence, etc.

  • Genome Nexus: provides annotation and interpretation of genetic variants in cancer

    • GRCh37 (optional):
      • gn-spring-boot: the backend service responsible for aggregating variant annotations from various sources
      • gn-mongo: variants fetched from external resources and small static data are cached in the MongoDB database
      • gn-vep: a Spring Boot REST wrapper service for VEP using GRCh37 data
    • GRCh38 (optional):
      • gn-spring-boot-grch38: same as gn-spring-boot service, however the VEP URL points to gn-vep-grch38
      • gn-mongo-grch38: contains static data relevant to GRCh38
      • gn-vep-grch38: a Spring Boot REST wrapper service for VEP using GRCh38 data

Option A: With local installation of Genome Nexus

For this option, you need to download the VEP cache, which is used by the gn-vep and gn-vep-grch38 services. We have pre-downloaded the VEP data and saved it to our AWS S3 bucket. If you are interested, see the instructions we followed to download the Genome Nexus VEP cache.

  1. OncoKB requires a MySQL server and the oncokb and oncokb-transcript databases imported. This step must be completed before continuing the installation process. Reach out to [email protected] to get access to the data dump.

  2. Download the Genome Nexus VEP data from our AWS S3 Bucket.

    # The home directory is used to store the VEP cache in this tutorial, but this can be changed to your preferred download location.
    cd ~
    mkdir gn-vep-data && cd "$_"
    
    mkdir 98_GRCh37 && cd "$_"
    curl -o 98_GRCh37.tar https://oncokb.s3.amazonaws.com/gn-vep-data/98_GRCh37/98_GRCh37.tar
    tar xvf 98_GRCh37.tar
    
    cd ..
    mkdir 98_GRCh38 && cd "$_"
    curl -o 98_GRCh38.tar https://oncokb.s3.amazonaws.com/gn-vep-data/98_GRCh38/98_GRCh38.tar
    tar xvf 98_GRCh38.tar
    
  3. Set the environment variables for the locations of the VEP caches.

    # Update path if the VEP data was installed elsewhere
    export VEP_CACHE=~/gn-vep-data/98_GRCh37
    export VEP_GRCH38_CACHE=~/gn-vep-data/98_GRCh38
    
  4. Run docker-compose to create containers.

    docker-compose --profile genome-nexus up -d
    

    Note: The --profile argument is used as a way to selectively enable services. Services with the genome-nexus profile will only be spun up when the profile is specified.
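Before running docker-compose for Option A, it can help to confirm that both cache variables point at populated directories. The following is a minimal sketch of such a preflight check; the function name `check_vep_cache` is hypothetical, and only the two variable names come from the steps above.

```shell
# Hypothetical preflight check: make sure both cache variables point at
# existing, non-empty directories before starting the containers.
check_vep_cache() {
    local name dir status=0
    for name in VEP_CACHE VEP_GRCH38_CACHE; do
        # Look up the value of the variable whose name is stored in $name.
        eval "dir=\${$name:-}"
        if [ -z "$dir" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$dir" ] || [ -z "$(ls -A "$dir")" ]; then
            echo "$name does not point to a non-empty directory: $dir"
            status=1
        else
            echo "$name OK: $dir"
        fi
    done
    return "$status"
}
```

Chaining it as `check_vep_cache && docker-compose --profile genome-nexus up -d` starts the containers only when both checks pass.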

Option B: Without local installation of Genome Nexus

  1. OncoKB requires a MySQL server and the oncokb and oncokb-transcript databases imported. This step must be completed before continuing the installation process. Reach out to [email protected] to get access to the data dump.
  2. Remove -Dgenome_nexus.grch37.url and -Dgenome_nexus.grch38.url properties from the oncokb service.
  3. Run docker-compose to spin up the oncokb and oncokb-transcript services:
    docker-compose up -d
    

Additional Information

Generating oncokb-transcript token

The docker compose file has a pre-generated oncokb-transcript JWT token, which is required to make API requests to the oncokb-transcript service. To generate the JWT token, go to the https://jwt.io/ website and follow these instructions:

  1. Add the auth key and set it to ROLE_ADMIN to grant roles. The payload section should look something like this:
    {
        "sub": "1234567890",
        "name": "John Doe",
        "auth":"ROLE_ADMIN",
        "iat": 1516239022
    }
    
  2. In the Verify Signature section, check the "secret base64 encoded" box. Copy and paste the oncokb-transcript base64 secret into the input box.
    • You can also change the default base64 secret used for encoding by generating a base64 string and adding the environment variable JHIPSTER_SECURITY_AUTHENTICATION_JWT_BASE64_SECRET: <new-base64-string> to oncokb-transcript.
  3. Replace -Doncokb_transcript.token with the JWT token you generated.
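The same token can also be produced locally instead of on the jwt.io website. The sketch below assumes HS256, the algorithm jwt.io selects by default; whether oncokb-transcript expects a different HMAC variant is not stated here, so treat the algorithm as an assumption. The helper names `b64url` and `make_jwt` are hypothetical, and the script relies on the `openssl` command-line tool.

```shell
# Base64url-encode stdin without padding, as JWTs require.
b64url() {
    openssl base64 -A | tr '+/' '-_' | tr -d '='
}

# Hypothetical helper: sign a JSON payload with HS256 (assumed algorithm).
# $1 = base64-encoded secret, $2 = JSON payload
make_jwt() {
    local secret_b64="$1" payload="$2"
    local header='{"alg":"HS256","typ":"JWT"}'
    local key_hex signing_input signature

    # openssl takes the HMAC key as hex, so decode the base64 secret first.
    key_hex=$(printf '%s' "$secret_b64" | openssl base64 -d -A | od -An -v -tx1 | tr -d ' \n')
    signing_input="$(printf '%s' "$header" | b64url).$(printf '%s' "$payload" | b64url)"
    signature=$(printf '%s' "$signing_input" \
        | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$key_hex" -binary \
        | b64url)
    echo "$signing_input.$signature"
}
```

For example, `make_jwt "$BASE64_SECRET" '{"sub":"1234567890","name":"John Doe","auth":"ROLE_ADMIN","iat":1516239022}'` prints a token for the payload shown above, where `BASE64_SECRET` is a placeholder for the oncokb-transcript secret. A new secret can be generated with `openssl rand -base64 64`.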

Generating new VEP data

OncoKB pre-downloads VEP data and saves it to an AWS S3 bucket. These steps are for OncoKB developers and show how to download new Ensembl VEP data and upload it to S3. However, you can follow along and save the VEP data to your own S3 bucket.

  1. Change the Ensembl image in the genome-nexus-vep Dockerfile to the desired version.
  2. Follow the instructions to download the VEP cache files and FASTA files for GRCh37 and GRCh38.
  3. After downloading, your directory should look like this:

    VEP_CACHE/
    ├─ homo_sapiens/
    │  ├─ 98_GRCh37/
    │  ├─ 98_GRCh38/

  4. Archive the files:

    tar cf 98_GRCh37.tar homo_sapiens/98_GRCh37
    tar cf 98_GRCh38.tar homo_sapiens/98_GRCh38

  5. Go to the AWS S3 web page and, under oncokb/gn-vep-data/, create two folders:

    98_GRCh37/
    98_GRCh38/

  6. Upload the tar files to the corresponding S3 folders.
  7. Make the two S3 folders (oncokb/gn-vep-data/98_GRCh37/ and oncokb/gn-vep-data/98_GRCh38/) publicly accessible.
  8. Update the gn-vep and gn-vep-grch38 services in docker-compose.yml by modifying the environment variable to point to the new FASTA file:

    gn-vep
    VEP_FASTAFILERELATIVEPATH=homo_sapiens/98_GRCh37/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz

    gn-vep-grch38
    VEP_FASTAFILERELATIVEPATH=homo_sapiens/98_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz

  9. Modify the Dockerfile line in genome-nexus-vep to use the new Ensembl VEP image. As of 4/28/2023, genome-nexus-vep uses ensemblorg/ensembl-vep:release_98.3.
  10. Push the new genome-nexus-vep image to DockerHub.
  11. Change the image for both gn-vep and gn-vep-grch38 to the newly pushed genome-nexus-vep image.
Questions?

The best way is to send an email to [email protected] so all our team members can help.

matchminer-curate's People

Contributors

ethansiegl, jiaojiao123, jjgao, victoria34, zhx828

matchminer-curate's Issues

Add a middle layer to convert customized interface data to original defined firebase format

The purpose of this feature is to make curation more flexible. Currently, the curation interface strictly follows the YAML format defined by Match Miner. This causes a lot of inconvenience: for example, to enter an age_input greater than 12 and less than 18, a curator has to enter an "Or" section containing two Clinical sections.

Below are two nominated features:
a) Genomic Section
annotated_variant: allow the curator to enter multiple variants separated by commas, such as "V600E,V600K,V600M"

b) Clinical Section
age_input: allow the curator to enter an age range in the format ">=12,<=18"

Store matching result in cache

Store mapping result like patient-trial? Or store result like patient-variant-trial? Or store mapping result like variant-trial?
