
lynnlangit / gcp-for-bioinformatics


GCP for Bioinformatics Researchers

Home Page: https://www.youtube.com/playlist?list=PL4Q4HssKcxYvcixWS08UFaYIH7y4IAV0z

License: Apache License 2.0

Shell 0.28% Jupyter Notebook 99.37% WDL 0.16% Nextflow 0.19%
gcp bioinformatics bioinformatics-pipeline bioinformatics-analysis genomics bioinformatics-researchers google nextflow google-batch

gcp-for-bioinformatics's Introduction

Google Cloud Platform (GCP) for Bioinformatics

This repository shows how to use Google Cloud Platform (GCP) public cloud services to scale bioinformatics data analysis tasks, following GCP best practices. All examples use genomic sample (input) data, tools and pipelines. The use cases included here go by any of the following names:

  • genomic-scale data workflows or pipelines
  • bioinformatics primary, secondary or tertiary analysis
  • distributed cloud-based batch jobs

This content is intended for researchers, in particular those who are NEW to working with GCP. You have a number of options for how to use the materials provided in this course; a summary is shown below.

This Repo includes content you can read, watch or run:

  • 📗 READ - one page of this Repo (MD page)
  • 📺 WATCH - linked YouTube screencasts
  • 📙 RUN - Jupyter Notebook example
  • :octocat: TRY - linked GitHub Repos
  • 📘 EXPAND - linked (external) resources
  • 🔍 SCAN - search a list in this Repo

NOTE: If you are looking for AWS guidance, see my 'aws-for-bioinformatics' Repo/Course at link


📺 Click below to WATCH 'Lynn's Welcome Video' (4 min) on YouTube

Welcome to GCP for Bioinformatics


Why would I choose to use a public cloud vendor for bioinformatics?

โญ๏ธ SAVE MONEY run (and pay for) scalable analysis jobs only when you need to run them
โญ๏ธ SAVE TIME use vendor-managed infrastructure & best-practice patterns for fast repeatable research
๐Ÿ“— READ the FAQ for GCP bioinformatics for this Repo
๐Ÿ“• READ Nature article: "Cloud computing for genomic data analysis and collaboration"
๐Ÿ“— READ the top 4 most common use cases for using the public cloud for bioinformatics researchers

Bioinformatics researchers wanting more advanced GCP content?

If you would like to learn more advanced concepts (including script examples and patterns) about working with Google Cloud Platform, see my Repo gcp-essentials --> link


New to Bioinformatics?

If you are NEW to bioinformatics and have a computational background...

  • :octocat: REVIEW my bioinformatics concepts, tools and terms
    • Designed for experienced cloud practitioners who are NEW to Bioinformatics
    • The 'student notes repo' is named Team Teri - link to 'who is Teri?'
    • This Repo includes links to explanations of bioinformatics concepts, tools and platforms - link

Contributions

We love contributions! See this short style guide when making pull requests to this repo.


gcp-for-bioinformatics's People

Contributors

droidada, gitter-badger, lhagen-isb, lynnlangit


gcp-for-bioinformatics's Issues

4_Test_Container_for_Pipelines.md => constantly restarting docker image

I created a Cloud Compute VM, using registry.hub.docker.com/lynnlangit/blastn as the Container image in the Deploy image VM config options. All other VM config options were set to default.

When I ssh onto the VM and run docker ps, I see that the container restarts roughly every minute. The same occurs with a number of different Docker images (public or from my GCP Artifact Registry). The VM logs do not provide any insight into why the container is restarting. I've tried explicitly setting the container command to /bin/bash, and I've tried custom containers with either an ENTRYPOINT or a CMD set to /bin/bash. No matter what I try, the container constantly restarts in the VM.

If I change the restart policy to never, then docker ps shows nothing. So, it appears that the container is either just running /bin/bash and then ending (and then restarting, with the always container restart policy) or the container is failing without even running /bin/bash. Again, the VM logs do not provide useful info on what is going on.

It would be great to have some info in the docs about how to use a container in a VM (via specifying the container in the VM setup) and docker attach.
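One possible explanation for the behaviour described above, offered as an unverified sketch and not as a confirmed diagnosis: a shell started with no terminal and an empty standard input reads end-of-file at once and exits cleanly, so a restart policy of always would start it again and again. The snippet below reproduces only that shell behaviour on a local machine. Redirecting stdin from /dev/null is an assumption standing in for a container started without an attached stdin or TTY; it does not touch Docker or the VM.

```shell
#!/bin/sh
# Assumption: a container started without an attached stdin/TTY gives its
# main process an empty stdin. We approximate that with /dev/null here.

start=$(date +%s)

# With nothing to read, a non-interactive bash hits end-of-file and exits.
bash </dev/null
status=$?

end=$(date +%s)
elapsed=$((end - start))

echo "bash exit status: $status"
echo "elapsed seconds: $elapsed"
```

If this is indeed the cause, keeping stdin open and allocating a TTY (as `docker run -it` does), or setting the container command to a long-running process, should keep the container up. This has not been checked against the VM configuration in the issue.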
