icebert / pblat-cluster
blat with cluster parallel hybrid computing support
Home Page: http://icebert.github.io/pblat-cluster/
License: Other
Hello Meng,
Great tool!! The cluster version is quick and accurate.
I have no issues using the cluster version with a genome file of around 6MB and a reads file of 800MB containing 149 read sequences. The analysis finishes in about 3-4 hours.
However, I now have a genome file of around 30MB and a reads file of around 1GB (with 49 read sequences). I left it running for about a day and it had still not finished. Sometimes it simply crashes. The files in question can be found at the links below:
https://www.mediafire.com/file/v5sgvajfnweeih7/genome.fa/file
https://www.mediafire.com/file/64jiwm4cgn0bbuq/reads.fa/file
Could you please advise on the ideal ratio of nodes/cores to memory for getting this analyzed as fast as possible? The job still doesn't finish even when I increase the cores/nodes and memory.
I use the following sbatch script:
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --mem=14G
#SBATCH --time=06:00:0
module load nixpkgs/16.09 gcc/7.3.0 openmpi/3.1.4
mpirun ./pblat-cluster genome.fa reads.fa -out=pslx out.pslx
The above script works well on the 6MB genome and 800MB reads files.
Thanks,
Vijay