
ebi_task's Introduction

EMBL-EBI TASK

Exercise

Using the latest human data from the Ensembl release and the Perl API, convert coordinates on a chromosome (e.g. chromosome 10 from 25000 to 30000) to the same region in GRCh37. Make the script as generic as possible so that it can be run as a command-line program.

Specific Coordinate Conversion

I have created a Python script, specific_cooordinate_conversion.py, which uses an Ensembl REST API endpoint to convert a specific coordinate range from the GRCh38 assembly to the GRCh37 assembly.

Input Format

The input to the Python script is a single region string of the form sequence_region_name:start..end:strand.

The sequence region name, start position, and end position are mandatory; the strand is optional.

For example, X:1000000..1000100:1 or X:1000000..1000100
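The input format above can be checked before any request is sent. The following is a minimal sketch of such a check, not the code from specific_cooordinate_conversion.py; the names `REGION_RE`, `Region`, and `parse_region` are hypothetical, and the assumption that the strand is either 1 or -1 is mine.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for "name:start..end" with an optional ":strand" suffix.
# Assumption: the strand, when given, is either 1 or -1.
REGION_RE = re.compile(
    r"^(?P<name>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+)(?::(?P<strand>-?1))?$"
)


class Region(NamedTuple):
    name: str
    start: int
    end: int
    strand: Optional[int]  # None when the strand was omitted


def parse_region(text: str) -> Region:
    """Split a region string into its parts, or raise ValueError."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"expected name:start..end[:strand], got {text!r}")
    strand = match["strand"]
    return Region(
        match["name"],
        int(match["start"]),
        int(match["end"]),
        int(strand) if strand is not None else None,
    )


print(parse_region("X:1000000..1000100:1"))
print(parse_region("X:1000000..1000100"))
```

Rejecting malformed input locally gives a clearer error message than a failed HTTP request would.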

Run the Python script specific_cooordinate_conversion.py, passing the region as a command-line argument:

python3 specific_cooordinate_conversion.py X:1000000..1000100
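For readers who want to see what such a call might look like, here is a rough sketch of a request to the Ensembl REST assembly-mapping endpoint (`/map/:species/:asm_one/:region/:asm_two`). It is an illustration under assumptions, not the repository's script: the function names `build_map_url` and `fetch_mapping` are hypothetical, and the standard-library `urllib` is used here although the original script may rely on a different HTTP client.

```python
import json
import urllib.request
from urllib.parse import quote

SERVER = "https://rest.ensembl.org"


def build_map_url(region, source="GRCh38", target="GRCh37", species="human"):
    """Build the URL for the assembly-mapping endpoint (hypothetical helper)."""
    # Keep ':' and '.' unescaped so the region stays in name:start..end form.
    safe_region = quote(region, safe=":.")
    return (
        f"{SERVER}/map/{species}/{source}/{safe_region}/{target}"
        "?content-type=application/json"
    )


def fetch_mapping(region):
    """Query the endpoint and return the decoded JSON (requires network access)."""
    with urllib.request.urlopen(build_map_url(region), timeout=30) as response:
        return json.load(response)


print(build_map_url("X:1000000..1000100:1"))
```

Calling `fetch_mapping("X:1000000..1000100:1")` would perform the actual request; it is left uncalled here so the sketch runs offline.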

Output format

The output is in JSON format and maps the chromosome region in GRCh38 to the corresponding coordinates in GRCh37:

{'mappings': 
  [
  {'original': {'end': 1000100, 'assembly': 'GRCh38', 'strand': 1, 'coord_system': 'chromosome', 'start': 1000000, 'seq_region_name': 'X'}, 
  'mapped': {'end': 960835, 'assembly': 'GRCh37', 'strand': 1, 'coord_system': 'chromosome', 'seq_region_name': 'X', 'start': 960735}
  }, 
  {'original': {'assembly': 'GRCh38', 'end': 1000100, 'start': 1000000, 'seq_region_name': 'X', 'strand': 1, 'coord_system': 'chromosome'}, 
  'mapped': {'coord_system': 'chromosome', 'strand': 1, 'seq_region_name': 'HG480_HG481_PATCH', 'start': 960735, 'end': 960835, 'assembly': 'GRCh37'}
  }
  ]
}
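A response of this shape can be reduced to plain region strings. The sketch below assumes only the keys visible in the output above (`mappings`, `mapped`, `seq_region_name`, `start`, `end`, `strand`); the helper name `mapped_regions` is hypothetical, and the sample is trimmed to the fields the helper reads.

```python
def mapped_regions(response):
    """Return one 'name:start..end:strand' string per mapped block."""
    regions = []
    for entry in response.get("mappings", []):
        mapped = entry["mapped"]
        regions.append(
            f"{mapped['seq_region_name']}:{mapped['start']}"
            f"..{mapped['end']}:{mapped['strand']}"
        )
    return regions


# Sample taken from the output shown above, reduced to the 'mapped' parts.
sample = {
    "mappings": [
        {"mapped": {"seq_region_name": "X", "start": 960735,
                    "end": 960835, "strand": 1, "assembly": "GRCh37"}},
        {"mapped": {"seq_region_name": "HG480_HG481_PATCH", "start": 960735,
                    "end": 960835, "strand": 1, "assembly": "GRCh37"}},
    ]
}

print(mapped_regions(sample))
# ['X:960735..960835:1', 'HG480_HG481_PATCH:960735..960835:1']
```

Note that a single input region can map to more than one target region, as with the patch region here, so the helper returns a list.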

Coordinate Conversion for all regions

I have created a Perl script, convert_coordinates_GRCh38_GRCh37.pl, which uses an Ensembl REST API endpoint to convert all chromosome sequence regions from GRCh38 to the corresponding coordinates in the GRCh37 assembly.

The script converts all chromosome coordinates, so it takes no input arguments.

Run the Perl script convert_coordinates_GRCh38_GRCh37.pl:

perl convert_coordinates_GRCh38_GRCh37.pl

Output

The output is a JSON file, data_out.json, containing all the mappings from the chromosome regions in GRCh38 to the corresponding coordinates in GRCh37.
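One way to enumerate whole-chromosome regions, sketched here in Python rather than Perl, is to read chromosome names and lengths from the Ensembl REST `/info/assembly/:species` response and turn each into a region string. This is an assumption about how such a script could work, not a description of convert_coordinates_GRCh38_GRCh37.pl; the helper name `whole_chromosome_regions` is hypothetical, the field names (`top_level_region`, `coord_system`, `name`, `length`) are my understanding of that response, and the sample data is a toy stand-in.

```python
def whole_chromosome_regions(assembly_info):
    """Build a 'name:1..length' region string for every chromosome entry."""
    regions = []
    for region in assembly_info.get("top_level_region", []):
        # Skip scaffolds, patches, and other non-chromosome regions.
        if region.get("coord_system") == "chromosome":
            regions.append(f"{region['name']}:1..{region['length']}")
    return regions


# Toy stand-in for an /info/assembly response; the lengths are illustrative only.
toy_info = {
    "top_level_region": [
        {"coord_system": "chromosome", "name": "1", "length": 5000},
        {"coord_system": "scaffold", "name": "KI270000.1", "length": 100},
        {"coord_system": "chromosome", "name": "X", "length": 3000},
    ]
}

print(whole_chromosome_regions(toy_info))
# ['1:1..5000', 'X:1..3000']
```

Each resulting string could then be passed to the mapping endpoint and the collected responses written to a single JSON file.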

Contributors

maheshkumarsundaram
