coordinates_conversion

Conversion programs that use the output of fasta_diff to update reference sequence IDs and coordinates in GFF3, BAM, BED, bedGraph, or VCF files. The main contributors are Han Lin (original development) and interns of the i5k Workspace.

Prerequisites

  • Python 2.7
  • samtools (optional, only for SAM/BAM related scripts)

Installation

pip install git+https://github.com/NAL-i5K/coordinates_conversion.git

Features

Scripts to convert reference sequence IDs and coordinates in different file formats.

Quick start

  1. Run fasta_diff
  • Compares two very similar FASTA files and outputs coordinate mappings using a multi-stage algorithm:

  • Stage 1: Find 100% matches

  • Stage 2: Find 100% substrings, where the full length of a new sequence can be found as a substring of an old sequence

  • Stage 3: Find cases where part of the sequence was converted into Ns

  • Stage 4: Find cases where an old sequence is split into two or more new sequences

  • Outputs (match.tsv) six tab-separated columns: old_id, old_start, old_end, new_id, new_start, new_end

    fasta_diff example_file/old.fa example_file/new.fa -o match.tsv -r report.txt
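
The first two stages can be illustrated with a minimal Python sketch. This is not the fasta_diff implementation: the function name `match_sequences`, the in-memory dict inputs, and the 0-based, half-open coordinates are all assumptions made for illustration, and stages 3 and 4 are left out.

```python
def match_sequences(old_seqs, new_seqs):
    """Sketch of stages 1 and 2 for dicts mapping sequence ID -> sequence.

    Returns (rows, unmatched), where each row is
    (old_id, old_start, old_end, new_id, new_start, new_end).
    Coordinates are 0-based and half-open; that convention is an
    assumption of this sketch, not a statement about match.tsv.
    """
    rows = []
    unmatched = dict(new_seqs)

    # Stage 1: a new sequence is identical to an old sequence.
    old_by_seq = {}
    for old_id, seq in old_seqs.items():
        old_by_seq.setdefault(seq, old_id)
    for new_id, seq in list(unmatched.items()):
        old_id = old_by_seq.get(seq)
        if old_id is not None:
            rows.append((old_id, 0, len(seq), new_id, 0, len(seq)))
            del unmatched[new_id]

    # Stage 2: the whole new sequence occurs as a substring of an old one.
    for new_id, seq in list(unmatched.items()):
        for old_id, old_seq in old_seqs.items():
            pos = old_seq.find(seq)
            if pos != -1:
                rows.append((old_id, pos, pos + len(seq), new_id, 0, len(seq)))
                del unmatched[new_id]
                break

    # Anything still unmatched would go on to the later stages.
    return rows, sorted(unmatched)


if __name__ == "__main__":
    old = {"old1": "ACGTACGT", "old2": "TTTTGGGGCCCC"}
    new = {"new1": "ACGTACGT", "new2": "GGGG", "new3": "AAAA"}
    rows, unmatched = match_sequences(old, new)
    for row in rows:
        print("\t".join(str(field) for field in row))
    print("unmatched:", unmatched)
```

Sequences left in `unmatched` are the ones that stages 3 and 4 would still have to resolve.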

  2. Select a conversion script that matches your file format
  3. Run the conversion script:
  • update_gff

    update_gff -a match.tsv example_file/example1.gff3 example_file/example2.gff3

  • update_bam

    • samtools must be installed before running this program.

    • If your BAM file has no corresponding index file (.bai), you can generate one with:

      samtools index example_file/example.bam

    • Then use update_bam to convert your BAM file:

      update_bam -a match.tsv example_file/example.bam

  • update_bed

    update_bed -a match.tsv example_file/example.bed

  • update_bedgraph

    update_bedgraph -a match.tsv example_file/example.bedGraph

  • update_vcf

    update_vcf -a match.tsv example_file/example.vcf
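
Conceptually, each update_* script applies the mappings in match.tsv to the coordinates in a file. The sketch below shows that idea for a single interval. It is an illustration, not the code used by the scripts: `load_mappings` and `convert_interval` are hypothetical helpers, the interval and the mapping are assumed to use the same coordinate convention, and an interval that is not fully contained in one mapped region is reported as unconvertible.

```python
import io


def load_mappings(handle):
    """Read six-column, tab-separated rows into a dict keyed by old_id.

    Rows that do not have six columns, or whose coordinates are not
    integers (for example a header line, if one is present), are skipped.
    """
    mappings = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 6:
            continue
        old_id, old_start, old_end, new_id, new_start, new_end = fields
        try:
            entry = (int(old_start), int(old_end), new_id,
                     int(new_start), int(new_end))
        except ValueError:
            continue
        mappings.setdefault(old_id, []).append(entry)
    return mappings


def convert_interval(mappings, seq_id, start, end):
    """Return (new_id, new_start, new_end), or None if no mapping covers it."""
    for old_start, old_end, new_id, new_start, _new_end in mappings.get(seq_id, []):
        if old_start <= start and end <= old_end:
            # Shift by the offset between the old and new region starts.
            offset = new_start - old_start
            return new_id, start + offset, end + offset
    return None


if __name__ == "__main__":
    # Made-up mapping rows, used only to exercise the helpers.
    example = io.StringIO(u"old1\t0\t100\tnew1\t0\t100\n"
                          u"old2\t50\t150\tnew2\t0\t100\n")
    table = load_mappings(example)
    print(convert_interval(table, "old2", 60, 80))
    print(convert_interval(table, "old2", 10, 40))
```

Returning `None` for intervals that fall outside every mapped region keeps the caller in charge of deciding whether to drop or report such features.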

Contributors

dytk2134, mpoelchau, hsiaoyi0504, childers, tony006469
