myvariant.info's Introduction

Introduction

This project came out of the 1st NoB Hackathon.

The scope of this project is to aggregate existing annotations for genetic variants. Variant annotation has drawn a huge amount of effort from researchers, which has made many annotation resources available, but also left them very scattered. Integrating all of them is hard, so we want to first create a simple way to pool them together, with high-performance programmatic access. That way, further integration (e.g. deduplication, deriving higher-level annotations, etc.) becomes much easier.

From the discussions at the hackathon, we settled on the strategy summarized below:

A very simple rule to aggregate variant annotations
  • each variant is represented as a JSON document
  • the only requirement is that the key of the JSON document (its "_id" field) follows HGVS nomenclature. For example:
     {
       "_id": "chr1:g.35366C>T",
       "allele1": "C",
       "allele2": "T",
       "chrom": "chr1",
       "chromEnd": 35367,
       "chromStart": 35366,
       "func": "unknown",
       "rsid": "rs71409357",
       "snpclass": "single",
       "strand": "-"
     }
  • that way, multiple annotations for the same variant can be merged into a single JSON document, with each annotation resource kept under its own field. Here is a merged example.
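
The merging rule above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the function name `merge_annotations` and the resource names `snp_db` and `other_db` are hypothetical, as are the fields of the second resource. The sketch only shows the idea that documents sharing the same "_id" collapse into one document, with each resource's annotations nested under that resource's own field.

```python
def merge_annotations(sources):
    """Merge per-resource documents into one document per variant.

    `sources` maps a resource name (hypothetical, chosen by the importer)
    to an iterable of JSON-like dicts, each carrying an HGVS-style "_id".
    """
    merged = {}
    for resource, docs in sources.items():
        for doc in docs:
            hgvs_id = doc["_id"]
            # Everything except "_id" is nested under the resource's own field.
            annotation = {k: v for k, v in doc.items() if k != "_id"}
            merged.setdefault(hgvs_id, {"_id": hgvs_id})[resource] = annotation
    return merged


# Illustrative input: the first document reuses fields from the example above;
# the second resource and its "score" field are made up for this sketch.
sources = {
    "snp_db": [{"_id": "chr1:g.35366C>T", "rsid": "rs71409357", "func": "unknown"}],
    "other_db": [{"_id": "chr1:g.35366C>T", "score": 0.5}],
}
merged = merge_annotations(sources)
```

Because the "_id" is the only shared key, adding another resource never requires changing the fields contributed by existing ones.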

A powerful query engine to access/query aggregated annotations

The query engine we developed for MyGene.info can be easily adapted to provide a high-performance, flexible query interface for programmatic access. MyGene.info follows the same spirit, but for gene annotations. It currently serves ~3M requests per month.

User contributions of variant annotations

User contributions are vital, given the scale of the available (and still increasing) resources. The simple rule defined above makes merging a new annotation resource very easy: it essentially amounts to writing a JSON importer. And the sophisticated query engine we built saves users the effort of building their own infrastructure, which gives them an incentive to contribute.

Also note that the data provider is not the only one who can write an importer; anyone who finds a useful resource can do so as well (after checking, of course, that the data release license allows it).
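
To give a feel for what "writing a JSON importer" might involve, here is a rough Python sketch. It assumes a tab-delimited input with hypothetical column names (`chrom`, `pos`, `ref`, `alt`, `rsid`) and handles only simple substitutions. The function names `make_hgvs_id` and `load_data` are illustrative and are not taken from the project's contribution guidelines.

```python
import csv
import io


def make_hgvs_id(chrom, pos, ref, alt):
    """Build an HGVS-style genomic ID for a simple substitution only."""
    return f"{chrom}:g.{pos}{ref}>{alt}"


def load_data(handle):
    """Yield one JSON-like document per row of a tab-delimited file.

    The column names used here are assumptions for this sketch; a real
    resource would need its own parsing logic.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield {
            "_id": make_hgvs_id(row["chrom"], row["pos"], row["ref"], row["alt"]),
            "rsid": row["rsid"],
        }


# A one-row in-memory file stands in for a real data release.
sample = io.StringIO("chrom\tpos\tref\talt\trsid\nchr1\t35366\tC\tT\trs71409357\n")
docs = list(load_data(sample))
```

Writing the importer as a generator keeps memory use flat even for large resources, since documents are produced one at a time.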

See the guidelines below for contributing a JSON importer.

How to contribute

See this How to contribute document.
