bcell-phylo's People

Contributors

jzehr, stephenshank, stevenweaver

bcell-phylo's Issues

Long germline sequences

Some of the germline sequences are very long, so the resulting profile alignments contain thousands of uninformative gaps at the beginning of each sequence. This inflates file size and slows down loading. For instance, V-30-3 does not contain any non-germline characters until site 7967 (see below), and is 16MB. The largest I've seen thus far is 189MB.

A solution could be as simple as running the gap trimmer on the profile alignments.
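As a rough illustration of what such trimming could look like, here is a minimal sketch. It is not the project's gap trimmer: the function name, the (name, sequence) record format, and the `germline_id` argument are all assumptions made for the example. It drops the leading columns in which every non-germline sequence is a gap.

```python
def trim_leading_gap_columns(records, germline_id, gap="-"):
    """Drop leading columns in which every non-germline sequence is a gap.

    `records` is a list of (name, aligned_sequence) pairs of equal length.
    `germline_id` (hypothetical) is the name of the germline record.
    """
    others = [seq for name, seq in records if name != germline_id]
    if not others:
        return records
    length = len(others[0])
    # First column where at least one non-germline sequence has a character.
    start = next(
        (i for i in range(length) if any(seq[i] != gap for seq in others)),
        length,
    )
    # Slice every record, germline included, so the alignment stays rectangular.
    return [(name, seq[start:]) for name, seq in records]
```

Only the leading block is removed, so internal gaps and the relative alignment of the remaining columns are left untouched.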

image

Snakemake issue #2

The output for the rule unaligned_to_amino_acids has to be muted for the different rearrangements, or else the rule raises an error. We need to figure out how to use the wildcards to pick up all the rearrangements for each germline gene.
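One possible direction is to discover the gene and rearrangement values from the files on disk rather than hard-coding them. The sketch below does this in plain Python; the file naming scheme (`<gene>_<rearrangement>_unaligned.fasta`) and the function name are assumptions for illustration, not the repository's actual layout. Inside a Snakefile, Snakemake's own `glob_wildcards` and `expand` helpers serve a similar purpose.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical layout: <data_dir>/<gene>_<rearrangement>_unaligned.fasta
PATTERN = re.compile(r"(?P<gene>.+)_(?P<rearrangement>[^_]+)_unaligned\.fasta$")


def discover_rearrangements(data_dir):
    """Map each gene to the sorted list of rearrangements found on disk."""
    found = defaultdict(list)
    for path in sorted(Path(data_dir).glob("*_unaligned.fasta")):
        match = PATTERN.match(path.name)
        if match:
            found[match["gene"]].append(match["rearrangement"])
    return dict(found)
```

The resulting mapping could then feed the input lists of a rule, so that new rearrangements are picked up without editing the rule itself.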

Make IgSCUEAL an explicit step in the pipeline

At the moment, IgSCUEAL was run to create the initial starting data for the pipeline. This is not good for reproducibility, and an explicit invocation will help efforts to productize these tools for Galaxy.

SnakeMake file

Rule gene_unaligned_fasta: the wildcards need to change so that the rule loops over the files instead of listing them all out.

regex

The regex in full_fafsta_to_v-gene_separate_fasta.py needs to be modified so that it only grabs germline genes 1-6 and NOT 7.
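A minimal sketch of such a pattern follows. It assumes that headers carry an IMGT-style V-gene name such as "IGHV3-30", which may not match the headers the script actually sees; the constant and function names are hypothetical. The character class `[1-6]` excludes family 7, and the negative lookahead keeps a multi-digit number from being read as a single-digit family.

```python
import re

# Assumption: the V-gene family is the number right after "IGHV".
V_FAMILY_1_TO_6 = re.compile(r"IGHV([1-6])(?!\d)")


def v_family(header):
    """Return the V-gene family (1-6) as an int, or None otherwise."""
    match = V_FAMILY_1_TO_6.search(header)
    return int(match.group(1)) if match else None
```

For example, a header containing "IGHV3-30" yields 3, while one containing "IGHV7-4-1" yields None and can be skipped.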

README

The README on the master branch needs to be updated to reflect our current pipeline.

environment file

Reduce the environment file to only the dependencies that we actually need. It is a waste of time to read in all the dependencies.

Make a script to clean data directory

The data directory can become a bit polluted after running several analyses. It would be nice to have some sort of script to clean it, so that it reflects its state at the beginning of the pipeline.
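A minimal sketch of such a script is shown below. The function name, the `keep` set of entries that make up the starting state, and the dry-run default are all assumptions for illustration; which files count as the pipeline's initial inputs would have to be decided by the project.

```python
import shutil
from pathlib import Path


def clean_data_directory(data_dir, keep, dry_run=True):
    """Remove every top-level entry of `data_dir` whose name is not in `keep`.

    Returns the sorted names that were (or, with dry_run, would be) removed.
    """
    removed = []
    for entry in sorted(Path(data_dir).iterdir()):
        if entry.name in keep:
            continue
        removed.append(entry.name)
        if dry_run:
            continue  # only report, do not delete
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return removed
```

Defaulting to a dry run means the list of entries can be inspected before anything is deleted; passing `dry_run=False` performs the removal.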
