q2-makarsa's People

Contributors

benkaehler, isaactowers, nbokulich, rhernandvel, zakir-hossine

q2-makarsa's Issues

Create conda packages and implement CI

Dependencies: #4

Prerequisites:

  1. Understanding of GitHub actions.
  2. Understanding of conda packaging.

Create a conda package for distribution of the necessary R dependencies and the plugin itself. Add continuous integration.

Will need to figure out how to get SpiecEasi to install via conda.

Here: https://github.com/BenKaehler/q2-SpiecEasi/actions/new?category=continuous-integration.

Example: https://github.com/qiime2/q2-dada2/blob/master/ci/recipe/meta.yaml and https://anaconda.org/kaehler/q2-clawback.

Create Network semantic type

Dependencies: #1

Prerequisites:

  1. Python programming.
  2. Understanding of QIIME 2 semantic types and how to create new ones.

Create a Network QIIME 2 semantic type. It should provide transformers for reading and writing the igraph data structures saved by the R script.

Outdated example with lots of explanatory comments: https://github.com/qiime2-graveyard/q2-dummy-types/tree/master/q2_dummy_types.

Real-world working example: https://github.com/qiime2/q2-dada2/blob/master/q2_dada2/_stats.py and https://github.com/qiime2/q2-dada2/blob/master/q2_dada2/_transformer.py. Background: https://dev.qiime2.org/latest/storing-data/.

Run time and memory requirements

Hi,

thanks for putting this together! 🚀

I've been trying to apply it to my FeatureTable of 323 samples, ~5000 features, and ~2,900,000 total frequency. Running it with 8 cores it crashed after 3 days when the memory usage reached 60 GB. Do you have any recommendation for the number of cores to use, and maybe an estimate for the corresponding run time and memory requirements?

Thanks a lot! 🙌

Best,
Lena

Enable pulsar batch mode

This is a note that I've taken the config file pulsar parameter out of the inputs, because it breaks qiime2 provenance tracking.

Could be added in future, but would need extra work. Perhaps a new config file semantic type would be needed, if it doesn't already exist.

Add metadata filtering for FlashWeave

FlashWeave seems to be sensitive to having exactly matching metadata, so we can probably put something in _flashweave.py to encourage compatibility before sending it for analysis.
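One way to encourage compatibility would be to align sample IDs before the data is handed to FlashWeave. The following is only a minimal sketch: it assumes the table and metadata have already been loaded as plain dictionaries keyed by sample ID, and `align_samples` is a hypothetical helper, not an existing function in _flashweave.py.

```python
def align_samples(table, metadata):
    """Keep only samples present in both inputs, in a shared order.

    `table` and `metadata` are assumed to be dicts keyed by sample ID
    (hypothetical stand-ins for the real feature table and metadata).
    Returns the aligned table, the aligned metadata, and the sorted list
    of sample IDs that were dropped because they appeared in only one input.
    """
    shared = sorted(set(table) & set(metadata))
    if not shared:
        raise ValueError("No sample IDs are shared between table and metadata")
    dropped = sorted(set(table) ^ set(metadata))
    aligned_table = {sid: table[sid] for sid in shared}
    aligned_metadata = {sid: metadata[sid] for sid in shared}
    return aligned_table, aligned_metadata, dropped
```

Returning the dropped IDs makes it possible to warn the user about what was filtered rather than silently discarding samples.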

All about weights

As of right now, Louvain community detection uses edge weights, but centrality calculations discard weight information.

It would be better if weight information were used consistently. That is, Louvain community detection should optionally allow unweighted calculation, and centrality calculations should use weights by default but optionally allow unweighted calculations.

But there are some complications. I'll collect them here to help with future unravelling.

SpiecEasi MB, SpiecEasi Glasso, and FlashWeave all return different "weights". They appear to be:

  • MB - beta coefficients from lasso regressions, so they probably control for "by-stander effects", as the FlashWeave paper calls them
  • Glasso - straight-up correlation, so it doesn't control for by-stander effects, except that it's set to zero if it's not significant according to an algorithm that does account for by-stander effects
  • FlashWeave - parameters from their model, which attempts to grow networks that control for by-stander effects, so they are opaque in terms of how they should be compared or interpreted.

So while there is some doubt about the specific interpretation of each weight, they all seem to be "correlation-like". That is, larger in absolute value implies a stronger connection. I contrast correlation-like with "distance-like", where a stronger connection would be implied by a smaller value.

Reviewing how weights are handled in our centrality statistics:

  • degree - not weighted
  • betweenness - weights are interpreted as distances
  • closeness - weights are interpreted as distances
  • eigenvector - weights are interpreted as connection strength
  • associativity - when weighted, degree is replaced by the sum of the weights incident on a node

So correlation-like weights are probably appropriate for the latter two, but should be flipped for the second and third. For the first it doesn't matter.
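The flip needed for betweenness and closeness could look something like the sketch below. It assumes distance = 1 / |weight|, which is one common choice but not the only one, and it uses a small pure-Python Dijkstra rather than the plugin's actual centrality code. The names `to_distance` and `shortest_path_lengths` are hypothetical.

```python
import heapq


def to_distance(weight):
    """Turn a correlation-like weight into a distance-like one.

    Assumption: distance = 1 / |weight|, so stronger associations
    (larger absolute weight) become shorter edges.
    """
    if weight == 0:
        raise ValueError("zero-weight edges should be dropped, not converted")
    return 1.0 / abs(weight)


def shortest_path_lengths(edges, source):
    """Dijkstra over an undirected edge list [(u, v, weight), ...].

    Weights are correlation-like and are flipped with `to_distance`
    before path lengths are accumulated.
    """
    graph = {}
    for u, v, w in edges:
        d = to_distance(w)
        graph.setdefault(u, []).append((v, d))
        graph.setdefault(v, []).append((u, d))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, d in graph.get(node, []):
            candidate = dist + d
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return best


# Two strong edges (|w| = 0.5) form a shorter route from a to c
# than the single weak direct edge (w = 0.2).
edges = [("a", "b", 0.5), ("b", "c", -0.5), ("a", "c", 0.2)]
lengths = shortest_path_lengths(edges, "a")
```

With the flip in place, paths through strong associations are preferred, which is the behaviour betweenness and closeness would need if the weights are correlation-like.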

ENH: `visualise-network`: add overview tab (with all groups)

In the visualise-network visualization it would be nice to have an overview tab with all groups (even if it is just static) for an easier overview (and condensed figure for publication). Switching between tabs makes it difficult to compare groups if there are many.

Likewise, such a tab could give a nice comparison of the network topologies. This overview tab could display a table with average network topology metrics for each group (and an additional stat test?). This would not need to be dynamic of course, as the groups would not change, so a static table would suffice.

Write Python script to create network visualisations

Dependencies: #1, #2

Prerequisites:

  1. Python programming.
  2. Javascript programming and HTML.
  3. Understanding of QIIME 2 visualisations.
  4. Understanding of igraph output format.

Write a Python script to create the visualisation that takes the Network semantic type and maybe the original table as inputs and displays an interactive network. It would be great if we could produce an interactive display of the network using the d3 javascript library.

Example: https://github.com/ConstantinoSchillebeeckx/q2-phylogram/blob/master/q2_phylogram/plugin_setup.py.

Add community detection to the JOSS paper

The new Louvain community detection functionality needs to be mentioned in the JOSS paper.

  • Add authorship details (name and affiliation) to JOSS paper.
  • Add brief rationale for including community detection to Statement of need.
  • Add brief description of Louvain community detection to Summary of functionality.

Please create a PR to the joss-paper branch with appropriate amendments to paper.md and joss.bib.

ENH: increase size/move download png button

In the visualise-network visualization the download as png button is a bit hidden. Could this be made more prominent, e.g. slightly larger and maybe at the top of the aesthetics controls?

ENH: lock visualization settings when switching between tabs

In the visualise-network visualization, it would be nice if the aesthetics settings remained locked when switching between tabs for easier comparison of groups. Actually, I would say this is essential to ensure consistency when preparing a figure for publication (otherwise it would be easy to accidentally introduce inconsistencies when creating a figure with all groups).

node IDs do not correspond to feature IDs

When putting in #78 and #79, I decided to try using various makarsa outputs with other QIIME 2 actions to test the new functionality.

The outputs of louvain-communities did not look quite like what I expected. I had understood that the node map would map features to modules... but the node and module IDs are both arbitrary IDs. This prevents the node map from having useful applications, e.g., to annotate or collapse features based on module identity.

@BenKaehler @rhernandvel is this expected? Shouldn't node IDs correspond to feature IDs? Are the feature IDs being replaced by arbitrary node IDs in louvain-communities, or are these the node labels in the input Network?

To reproduce:

Using the outputs from the readme tutorial, this action will show you the node and module IDs:

qiime metadata tabulate \
        --m-input-file node-map.qza \
        --o-visualization node-map.qzv

and this action fails, because the node IDs do not actually correspond to feature IDs:

qiime feature-table group \
	--i-table sponge-feature-table.qza \
	--p-axis feature \
	--m-metadata-file node-map.qza \
	--m-metadata-column COMMUNITY \
	--p-mode sum \
	--o-grouped-table grouped-table.qza

Add weights to edges

This could be done in the R script:


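library(Matrix)
# Glasso: convert the optimal covariance matrix to correlations
secor  <- cov2cor(getOptCov(se.gl.amgut))
# MB: symmetrise the optimal beta coefficients, keeping the larger absolute value
sebeta <- symBeta(getOptBeta(se.mb.amgut), mode='maxabs')
# Edge lists as (row, column, weight); Glasso keeps only the upper triangle of refit edges
elist.gl     <- summary(triu(secor*getRefit(se.gl.amgut), k=1))
elist.mb     <- summary(sebeta)
elist.sparcc <- summary(sparcc.graph*sparcc.amgut$Cor)

# Overlay histograms of the edge weights from the three methods
hist(elist.sparcc[,3], main='', xlab='edge weights')
hist(elist.mb[,3], add=TRUE, col='forestgreen')
hist(elist.gl[,3], add=TRUE, col='red')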

(taken from the SpiecEasi README)

Conversion of NodeMap to Metadata failing

Hi @rhernandvel and @nbokulich, could you please have a quick look at this?

There might be a quick fix that you can see straight away.

This was my shell session:

$ qiime makarsa louvain-communities --i-network-input pd-mouse-network.qza --o-community-out louvain-nodes.qza
Saved NodeMap to: louvain-nodes.qza
$ qiime makarsa visualise-network --i-network pd-mouse-network.qza --m-metadata-file louvain-nodes.qza --o-visualization louvain-network.qza
There was an issue with viewing the artifact 'louvain-nodes.qza' as QIIME 2 Metadata:

  Artifacts with type NodeMap cannot be viewed as QIIME 2 metadata.

I can see the metadata registrations for NodeMaps in the plugin setup, so I guess some small component is missing.

So that you can reproduce the issue I've included the input. (I had to zip it because github doesn't like Q2 artifacts.)

metadata merging issues

Currently, if you input multiple metadata files to visualise-network and metadata is missing for one of the nodes in one of the files, then you won't be able to see the metadata for that node from any of the other files.

This relates to how QIIME 2 merges metadata. There is a fix coming, which will require users to merge metadata using, say, an outer join before feeding it to visualise-network. The fix will be after the next QIIME 2 release, however.

It would be good to figure out a work-around in the meantime, perhaps implement our own merge method until the official one is available.
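A stop-gap merge could behave like an outer join, keeping every node seen in any file. The sketch below is only an illustration under simplifying assumptions: each metadata file is represented as a plain dictionary of the form {node_id: {column: value}}, and `outer_join` is a hypothetical helper rather than anything in QIIME 2 or q2-makarsa.

```python
def outer_join(*tables, missing=None):
    """Outer-join several {node_id: {column: value}} mappings.

    Every node ID seen in any table is kept; columns a node lacks are
    filled with `missing`. Later tables win when a column name repeats.
    """
    # Collect all column names in first-seen order.
    columns = []
    for table in tables:
        for row in table.values():
            for column in row:
                if column not in columns:
                    columns.append(column)
    merged = {}
    for table in tables:
        for node_id, row in table.items():
            # Start each new node with every column set to `missing`.
            target = merged.setdefault(node_id, dict.fromkeys(columns, missing))
            target.update(row)
    return merged


taxonomy = {"n1": {"taxon": "A"}, "n2": {"taxon": "B"}}
communities = {"n1": {"community": 1}}
merged = outer_join(taxonomy, communities)
```

Here node n2 has no community assignment, but its taxon is still available after the merge instead of the whole node being dropped.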

Write R Script

Prerequisites:

  1. R programming.
  2. Understanding of SpiecEasi.
  3. Understanding of QIIME 2 FeatureTable[Frequency] type, and how to export into a format that can be imported into R.

Write an R script that loads a table exported from a FeatureTable[Frequency] artifact and saves an igraph object when it has completed. It should expose the parameters of a normal call to SpiecEasi.

Example: https://github.com/qiime2/q2-dada2/blob/master/q2_dada2/assets/run_dada.R.

An error was encountered while running FlashWeave in Julia (return code 127)

Hi,

thanks so much for wrapping FlashWeave - I'm super excited about this plugin! 🚀

I installed q2-makarsa in a fresh qiime2-2023.2 environment and tried running it with my FeatureTable, without modifying any of the optional parameters, but got the error:
An error was encountered while running FlashWeave in Julia (return code 127)

I saw that it might mean that the command is not found in Julia?

I hope it's okay that I'm already trying to use FlashWeave! 🙌

Cheers!
Lena


This is the complete error message:

Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: run_FlashWeave.jl --datapath /scratch/lfloerl/tmpdata/tmp2ah97p88/input-data.tsv --output /scratch/lfloerl/tmpdata/tmp2ah97p88/network.gml --max_k 3 --alpha 0.01 --conv 0.01 --max_tests 1000000 --hps 5 --n_obs_min -1 --time_limit -1.0 --prec 64 --update_interval 30 --verbose --sensitive --feed_forward --FDR --normalize --make_sparse

/usr/bin/env: julia: No such file or directory
Traceback (most recent call last):
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/site-packages/q2_makarsa/_flashweave.py", line 84, in flashweave
    run_commands([cmd])
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/site-packages/q2_makarsa/_run_commands.py", line 19, in run_commands
    subprocess.run(cmd, check=True)
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_FlashWeave.jl', '--datapath', '/scratch/lfloerl/tmpdata/tmp2ah97p88/input-data.tsv', '--output', '/scratch/lfloerl/tmpdata/tmp2ah97p88/network.gml', '--max_k', '3', '--alpha', '0.01', '--conv', '0.01', '--max_tests', '1000000', '--hps', '5', '--n_obs_min', '-1', '--time_limit', '-1.0', '--prec', '64', '--update_interval', '30', '--verbose', '--sensitive', '--feed_forward', '--FDR', '--normalize', '--make_sparse']' returned non-zero exit status 127.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/site-packages/q2cli/commands.py", line 352, in __call__
    results = action(**arguments)
  File "<decorator-gen-398>", line 2, in flashweave
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/site-packages/qiime2/sdk/action.py", line 381, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/scratch/lfloerl/.condaenvs/qiime2-2023.2-new/lib/python3.8/site-packages/q2_makarsa/_flashweave.py", line 86, in flashweave
    raise Exception(
Exception: An error was encountered while running FlashWeave in Julia (return code 127), please inspect stdout and stderr to learn more.
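Exit status 127 is the usual "command not found" code, which matches the `/usr/bin/env: julia: No such file or directory` line above: the `julia` executable was not on PATH in that environment. A pre-flight check could report this more directly. The sketch below uses the standard library's `shutil.which`; the functions `missing_executables` and `check_flashweave_dependencies` are hypothetical and not part of q2-makarsa.

```python
import shutil


def missing_executables(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def check_flashweave_dependencies():
    """Raise a readable error before launching the Julia script."""
    # Both names are taken from the error output above.
    missing = missing_executables(["julia", "run_FlashWeave.jl"])
    if missing:
        raise RuntimeError(
            "Cannot run FlashWeave; not found on PATH: " + ", ".join(missing)
        )
```

Calling `check_flashweave_dependencies()` before `subprocess.run` would turn the opaque return code into a message that names the missing program.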
