veg / bcell-phylo
Take antibody sequence data from JSON to FASTA format
Some of the germline sequences are very long, causing the resulting profile alignments to contain thousands of uninformative gaps at the beginning of the sequence. This inflates file size and slows down loading. For instance, V-30-3 does not contain any non-germline characters until site 7967 (see below), and is 16MB. The largest I've seen thus far is 189MB.
A solution could be as simple as running the gap trimmer on the profile alignments.
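The trimming idea can be sketched as follows. This is not the pipeline's existing gap trimmer, only a minimal illustration that assumes the alignment is held as a dict of equal-length strings and that the germline record is identified by a known name (`germline_id` is a hypothetical parameter). It drops the leading columns in which every non-germline sequence is a gap.

```python
from typing import Dict

GAP = "-"


def trim_leading_gap_columns(alignment: Dict[str, str], germline_id: str) -> Dict[str, str]:
    """Drop leading columns in which every non-germline sequence is a gap."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences in an alignment must have equal length")
    width = lengths.pop()

    # Only the non-germline sequences decide which columns are informative.
    others = [seq for name, seq in alignment.items() if name != germline_id]
    if not others:
        return dict(alignment)

    # First column where any non-germline sequence has a non-gap character;
    # falls back to the full width if no such column exists.
    start = next(
        (i for i in range(width) if any(seq[i] != GAP for seq in others)),
        width,
    )
    return {name: seq[start:] for name, seq in alignment.items()}
```

Note that trimming shifts site coordinates relative to the full germline sequence, so any downstream step that reports site numbers would need the offset.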
The output for rule: unaligned_to_amino_acids has to be muted for the different rearrangements, or else the rule raises an error. We need to figure out how to use the wildcards to pick up all the rearrangements for each germline gene.
It will be much simpler and more comprehensive to annotate from IgSCUEAL output. The portion of pipeline that uses BLAST should be removed.
Add the ability to toggle amino acid vs. nucleotide alignments.
At the moment, IgSCUEAL was run to create the initial starting data for the pipeline. This is not good for reproducibility, and explicit invocations will help efforts to productize these tools for Galaxy.
rule gene_unaligned_fasta: need to change the wildcards to loop over the files instead of listing them all out in the rule
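One way to avoid listing every file by hand is to discover the gene names from the files on disk. The sketch below uses plain Python and assumes a hypothetical naming scheme of `<gene>_unaligned.fasta`; the actual file names in the pipeline may differ. Inside a Snakefile, Snakemake's `glob_wildcards` and `expand` helpers serve the same purpose.

```python
import re
from pathlib import Path
from typing import List


def discover_genes(directory: str,
                   pattern: str = r"^(?P<gene>.+)_unaligned\.fasta$") -> List[str]:
    """Return the sorted gene names of all files in `directory` matching `pattern`.

    The default pattern is an assumption for illustration only.
    """
    regex = re.compile(pattern)
    genes = []
    for path in Path(directory).iterdir():
        match = regex.match(path.name)
        if path.is_file() and match:
            genes.append(match.group("gene"))
    return sorted(genes)
```

The resulting list can then be fed to whatever produces the rule's input and output paths, so adding a new gene only requires adding its file.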
the regex in full_fafsta_to_v-gene_separate_fasta.py needs to be modified so that it only grabs germline genes 1-6 and NOT 7
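A possible shape for such a pattern is sketched below. It assumes the family number directly follows a `V` in the header and is followed by a hyphen (for example `IGHV3-30`); this naming convention is an assumption, and the regex currently used in the script is not shown here.

```python
import re

# Assumed header convention: family number directly after "V", then a hyphen,
# e.g. "IGHV3-30". The character class restricts matches to families 1-6.
V_GENE_1_TO_6 = re.compile(r"V([1-6])-")


def is_family_1_to_6(header: str) -> bool:
    """Return True if the header names a V gene from families 1 through 6."""
    return V_GENE_1_TO_6.search(header) is not None
```

Requiring the hyphen immediately after the digit keeps a two-digit family number from being matched on its first digit.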
The README on the master branch needs to be updated to reflect our current pipeline.
Reduce the environment file to only the dependencies that we actually need. It is a waste of time to read in all the dependencies.
The data directory can become a bit polluted after running several analyses. It would be nice to have some sort of script to clean it, so that it reflects its state at the beginning of the pipeline.
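A cleaning script along these lines might look like the sketch below. It assumes the initial state can be described by an explicit list of files to keep (the `keep` argument is hypothetical; the pipeline does not currently define such a list) and defaults to a dry run so nothing is deleted by accident.

```python
from pathlib import Path
from typing import Iterable, List


def clean_data_dir(data_dir: str, keep: Iterable[str], dry_run: bool = True) -> List[Path]:
    """Remove every file under `data_dir` that is not listed in `keep`.

    `keep` holds paths relative to `data_dir`. Returns the files that were
    (or, in a dry run, would be) removed.
    """
    root = Path(data_dir)
    keep_paths = {root / name for name in keep}
    removed = []

    # Reverse-sorted so that nested entries come before their parents.
    entries = sorted(root.rglob("*"), reverse=True)
    for path in entries:
        if path.is_file() and path not in keep_paths:
            removed.append(path)
            if not dry_run:
                path.unlink()

    if not dry_run:
        # Remove directories left empty by the deletions above.
        for path in entries:
            if path.is_dir() and not any(path.iterdir()):
                path.rmdir()
    return removed
```

Calling it first with `dry_run=True` and inspecting the returned list gives a chance to confirm nothing important is about to be deleted.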