cgs-apps

Genomics apps for accessing data, importing data, visualizing data, etc.

This package is part of the CGS project.

This package provides apps that are plugins for the Hue UI. This module also contains user interfaces that are available through the web interface.

The apps available in Hue are described below.

Installation

To install a new app, download the zip file, decompress it, and run the following from the decompressed directory: sudo python installCGSapps.py <appname1> <appname2>

Apps

The variants app

The variants app handles genomic variants. It provides the following functionality:

  • Importing VCF files into a NoSQL database: the VCF file is converted to JSON, then to an Avro file that is stored and finally loaded into HBase.
  • Querying variants from Impala: a Hive Metastore is built on top of HBase, so the data can then be queried through Impala.
  • Importing patient/sample information: a web interface allows the user to register information about a sample.
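The first step of the import pipeline (VCF to JSON) can be illustrated with a minimal sketch. This is not the converter used by the app, which relies on pyvcf; it only assumes the eight fixed tab-separated columns defined by the VCF format, and the function names are hypothetical. The Avro and HBase steps are left out. It assumes Python 3.

```python
import json

# The eight fixed columns of a VCF data line, in specification order.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def vcf_line_to_record(line):
    """Turn one tab-separated VCF data line into a plain dict (hypothetical helper)."""
    values = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, values[:8]))
    record["POS"] = int(record["POS"])
    # ALT may list several comma-separated alleles.
    record["ALT"] = record["ALT"].split(",")
    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
    info = {}
    for item in record["INFO"].split(";"):
        if item and item != ".":
            key, sep, value = item.partition("=")
            info[key] = value if sep else True
    record["INFO"] = info
    return record


def vcf_to_json_lines(lines):
    """Yield one JSON string per variant, skipping '#' header lines."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield json.dumps(vcf_line_to_record(line), sort_keys=True)


sample = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t10177\trs367896724\tA\tAC\t100\tPASS\tAC=2130;DB\n",
]
for json_line in vcf_to_json_lines(sample):
    print(json_line)
```

Emitting one JSON document per variant keeps memory use flat for large files, which matters given the file sizes mentioned in the import section.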

These functionalities are available through the Hue interface and, for external clients, through an API. For more details about the API provided in this app, see here.

The variants app - Install

For now, you need to go through additional steps to run this app (not included in installCGSapps.py yet):

  1. sudo python installCGSapps.py variants
  2. sudo easy_install pip
  3. sudo pip install ordereddict (if you have python < 2.7)
  4. sudo pip install counter (if you have python < 2.7)
  5. sudo pip install pyvcf
  6. sudo pip install djangorestframework==3.2.5
  7. sudo pip install markdown
  8. sudo pip install django-filter
  9. Allow the app to create temporary files (only for dev): chmod -R 777 /usr/lib/bin/hue
  10. Initialize the database by going to: http://quickstart.cloudera:8888/variants/database/initialize/
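After running the pip steps above, a quick check that the packages can be found may save a debugging round. The sketch below is an assumption-laden helper, not part of installCGSapps.py: the import names (vcf, rest_framework, markdown, django_filters) are the ones these packages usually expose, and it uses importlib.util.find_spec, so it assumes a Python 3 interpreter rather than the Python 2 setup the steps mention.

```python
import importlib.util

# pip package name -> import name (assumed; adjust if your versions differ).
REQUIRED_MODULES = {
    "pyvcf": "vcf",
    "djangorestframework": "rest_framework",
    "markdown": "markdown",
    "django-filter": "django_filters",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the sorted pip package names whose module cannot be located."""
    return sorted(
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    )


missing = missing_packages()
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it with the same interpreter that Hue uses, otherwise the result says nothing about the app's environment.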

If you have problems with Hue permissions, or if installCGSapps.py does not seem to reload views.py after you have modified it, you can try the following command in your virtual machine (not recommended in production, for debugging only): find /usr/lib/hue -type d -exec chmod 777 {} \;

HBase might not be available when you resume your VM, in which case CGS will not work correctly. If that happens, restart the HBase services:

  • sudo service hbase-master restart
  • sudo service hbase-regionserver restart

The variants app - Importing data

For now, the VCF import uses the local disk of the node where Hue is installed. Make sure you have enough free space on that disk, especially for files larger than 10 GB.

  • Upload your vcf through the Hue interface inside your user directory
  • Go to cgs/sample
  • Select your vcf file then click on "Import directly"
  • The import is in progress, and the data will be available soon through Impala/Hive and HBase

Note: For now, only one VCF import per user works at a time (do not launch simultaneous imports). This will be improved in later versions.
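Because the import uses the local disk of the Hue node, it can help to compare the file size with the free space before starting. The following is a minimal sketch, assuming Python 3; has_room_for and its safety_factor parameter are hypothetical, and the factor of 3 is only a guess at the room needed for the intermediate JSON and Avro files.

```python
import os
import shutil
import tempfile


def has_room_for(vcf_path, work_dir, safety_factor=3.0):
    """Return True if work_dir has at least safety_factor times the VCF size free.

    safety_factor is an assumed margin for intermediate files, not a measured value.
    """
    needed = os.path.getsize(vcf_path) * safety_factor
    free = shutil.disk_usage(work_dir).free
    return free >= needed


# Demonstration with a tiny temporary file standing in for a real VCF.
with tempfile.NamedTemporaryFile(suffix=".vcf") as tmp:
    tmp.write(b"##fileformat=VCFv4.1\n")
    tmp.flush()
    print(has_room_for(tmp.name, tempfile.gettempdir()))
```

In practice work_dir would be the directory where Hue writes its temporary files on that node.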

The variants app - Querying data

CGS implements almost the same data interface as Google Genomics, so you can rely on their documentation. So far, only the 'variants' section is supported.

  • Accessing a single variant: http://quickstart.cloudera:8888/variants/api/variants/ followed by the variant identifier. For example: http://quickstart.cloudera:8888/variants/api/variants/ulb|0|1|10177|A/
  • Searching variants as in Google Genomics (see their documentation to structure your request correctly) is done through a POST query at http://quickstart.cloudera:8888/variants/api/variants/search/. If you do not want to submit any field, you can directly modify the code in api.py at VariantDetail so that you can test easily through a GET query (for dev only).
  • Highlander has dedicated access to query data according to its table structure through http://quickstart.cloudera:8888/variants/api/variants/highlander_search/. It supports only a very limited range of SELECT queries, and it will not always return the same data as the Highlander table would. For example, a select count(*) counts the number of variants in CGS, whereas in Highlander it would count the number of calls. Changing the behavior of these queries requires contributions from Highlander developers.
    • The data sent to CGS should be a POST with the following fields:
      • method: "SELECT"
      • fields: "field1, field2, ..." or "count(*)"
      • condition: "field1 = value1 AND field2 != value2 ..."
      • limit: integer
      • offset: integer
      • order-by: "field" (mandatory if an offset > 0 is given)
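The requests described above can be prepared with a few lines of Python. This is a sketch under stated assumptions, written for Python 3: the helper names are hypothetical, optional fields are assumed to be omittable, and the POST body is assumed to be form-encoded (the source does not say whether the API expects form data or JSON). The only rule taken directly from the list is that order-by is mandatory when an offset greater than 0 is given.

```python
from urllib.parse import urlencode

BASE_URL = "http://quickstart.cloudera:8888/variants/api/variants/"


def variant_url(variant_id, base_url=BASE_URL):
    """Build the URL of a single variant, e.g. for 'ulb|0|1|10177|A'."""
    return base_url + variant_id + "/"


def highlander_query(fields, condition=None, limit=None, offset=None, order_by=None):
    """Build the POST fields for highlander_search as a dict (hypothetical helper)."""
    if offset is not None and offset > 0 and not order_by:
        raise ValueError("order-by is mandatory when offset > 0")
    query = {"method": "SELECT", "fields": fields}
    # Assumption: fields that are not set can simply be left out.
    if condition is not None:
        query["condition"] = condition
    if limit is not None:
        query["limit"] = limit
    if offset is not None:
        query["offset"] = offset
    if order_by:
        query["order-by"] = order_by
    return query


print(variant_url("ulb|0|1|10177|A"))

# Assumption: the endpoint accepts a form-encoded body.
body = urlencode(highlander_query("count(*)")).encode("ascii")
print(body)
```

The encoded body could then be sent with urllib.request.urlopen(BASE_URL + "highlander_search/", data=body), which issues a POST because data is provided.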

App 2

There is no other app yet. Please do not hesitate to contribute one.
