lentendu / networknullhpc
OTU co-occurrence network inference based on null model for HPC
License: MIT License
For example -c chunk_size, default value: 1e5
A lower value would increase parallelisation and lower the memory request for the edge step (currently 12G for 1 thread).
The Slurm memory request for the edge step needs to be adapted to this chunk size.
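The trade-off above can be illustrated with a small sketch. It assumes, hypothetically, that the edge step walks through all unordered OTU pairs in consecutive chunks of chunk_size pairs; the function name and the pair counting are illustrative and are not taken from the networknullhpc code.

```python
import math

def edge_chunks(n_otus, chunk_size=100_000):
    """Split all unordered OTU pairs into consecutive chunks (hypothetical helper)."""
    n_pairs = n_otus * (n_otus - 1) // 2      # number of unordered OTU pairs
    n_chunks = math.ceil(n_pairs / chunk_size)
    # (start, end) index ranges with end exclusive; the last chunk may be shorter
    return [(i * chunk_size, min((i + 1) * chunk_size, n_pairs))
            for i in range(n_chunks)]

chunks = edge_chunks(1000)
print(len(chunks), chunks[-1])
```

With a smaller chunk_size the same pairs are spread over more, shorter chunks, so each job would hold fewer pairs in memory at once.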
If more than one node is provided to nodelist, single-task jobs will complain and/or not be queued.
It would be wonderful if you, or we, could supply a version that supports qsub-based HPC. I think that is not so difficult to do.
Replace %x with the actual job names; %x only works with sbatch.
Hello,
While running the program, I encountered this warning message in the log file:
Warning message:
`graph.data.frame()` was deprecated in igraph 2.0.0.
ℹ Please use `graph_from_data_frame()` instead.
The program works fine. Just an FYI perhaps for future updates? Thank you!
First remove low occurrence OTUs
Then remove low abundance samples
The other way around can keep samples that end up with low abundance: when most OTUs in a sample are low-occurrence OTUs, the sample passes the abundance filter first and then loses most of its reads when those OTUs are removed.
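A minimal sketch of this filtering order, assuming a table stored as a dictionary of samples mapping to OTU read counts. The function name, the data layout, and both thresholds are hypothetical and only serve to show why the order matters.

```python
def filter_table(table, min_occurrence, min_abundance):
    """table: dict sample -> dict otu -> read count (hypothetical layout)."""
    # Step 1: count in how many samples each OTU occurs
    occurrence = {}
    for counts in table.values():
        for otu, n in counts.items():
            if n > 0:
                occurrence[otu] = occurrence.get(otu, 0) + 1
    kept_otus = {otu for otu, k in occurrence.items() if k >= min_occurrence}
    # Step 2: drop low-occurrence OTUs, then drop samples whose remaining total is too low
    filtered = {}
    for sample, counts in table.items():
        kept = {otu: n for otu, n in counts.items() if otu in kept_otus}
        if sum(kept.values()) >= min_abundance:
            filtered[sample] = kept
    return filtered

table = {
    "s1": {"otu1": 50, "otu2": 40, "otu3": 0},
    "s2": {"otu1": 30, "otu2": 60, "otu3": 0},
    "s3": {"otu1": 5, "otu2": 0, "otu3": 95},
}
result = filter_table(table, min_occurrence=2, min_abundance=50)
print(sorted(result))
```

Sample s3 has 100 reads before filtering, so it would pass a sample filter applied first, but only 5 reads remain once the single-sample otu3 is removed.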
Hello,
First of all, thanks for the code and package! It is something I've been thinking of and trying to do, and I love to see that work has already been done on it.
For the input OTU table, I was wondering if it only considers read count data? We all know that many potential biases can be introduced during the PCR process and the bioinformatics pipeline. Therefore, in many metazoan metabarcoding studies, people convert the read count data to presence/absence data (1 vs. 0) for downstream analyses. So, I am curious about what approach this code takes.
Hello @lentendu,
Thank you again for creating the script. I have been making progress and getting results along the way. I have a quick question about the -m null_model option when using the permatfull function from vegan. Although I have been looking for documentation on the different options for fixedmar, almost none of it discusses the rationale for choosing one over the other, e.g., when to choose rows vs. columns vs. both.
Therefore, I was wondering if you have had experience with these options before? Also, I am curious what you think about them, for example, which one is better suited for analyzing an OTU table? Thank you!
Could be a limiting factor on some HPC systems.
Allow fixing the maximum memory per job per CPU for parallel and array jobs
Default is 1%
Option -p for percent
-m option for null model
-n option for normalization
ratio: count ratio scaled to sum to same total (as defined by option -d) in each sample and rounded
log_ratio: ratio, then log transform (log method of decostand), then scale
sqrt_ratio: ratio, then sqrt transform (a.k.a. Hellinger transformation), then scale
no: no normalization
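As a rough illustration of the ratio option, the sketch below scales one sample's counts to a common total and rounds them. The function name, the list layout, and the depth argument (standing in for the total set by option -d) are assumptions; the log_ratio and sqrt_ratio variants are not reproduced here.

```python
def ratio_normalize(counts, depth):
    """Scale one sample's counts so they sum to about `depth`, then round."""
    total = sum(counts)
    if total == 0:
        # an empty sample stays empty instead of dividing by zero
        return [0] * len(counts)
    return [round(c * depth / total) for c in counts]

sample = [10, 30, 60]
normalized = ratio_normalize(sample, depth=1000)
print(normalized)
```

Because of the rounding, the normalized counts of a sample may differ from depth by a few reads.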
Hello!
It has come to my attention that there seem to be two hdf5 packages available in R: hdf5r and rhdf5. In the readme file, you mention that we should install the hdf5r package, while the script loads the library with library(rhdf5). I was wondering if you could clarify this a bit more? Thank you!
Hello,
While I was trying the -m columns option for choosing the null model, I encountered the error below, which causes the submitted job to show a status of DependencyNeverSatisfied:
Error in match.arg(fixedmar, c("none", "rows", "columns", "both")) :
'arg' must be NULL or a character vector
Calls: permatfull -> match.arg
Execution halted
I was wondering if you could help look into this? Thank you!
Compute Spearman's rho of the observed and null matrices for multiple seeds in each array job, to reduce the time associated with job queueing (initialisation, completion).
For example, compute 10 seeds per array job for matrices with fewer than 1000 OTUs.
The runtime should come close to, but not exceed, the maximum time limit of the short queue.
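The grouping of seeds into array jobs could look like the sketch below. It assumes seeds are numbered from 1 and that each array task receives one consecutive batch; the function name and the numbering are hypothetical.

```python
def seeds_per_task(n_seeds, seeds_per_job=10):
    """Group seeds 1..n_seeds into batches, one batch per array task."""
    seeds = list(range(1, n_seeds + 1))
    # consecutive slices of at most seeds_per_job seeds; the last may be shorter
    return [seeds[i:i + seeds_per_job]
            for i in range(0, n_seeds, seeds_per_job)]

batches = seeds_per_task(95)
print(len(batches), batches[-1])
```

The batch size would then be tuned so that one task stays just under the time limit of the short queue.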