Giter VIP home page Giter VIP logo

micnettoolbox's Introduction

GitHub license Python 3.9.6

MicNet Toolbox: visualizing and deconstructing a microbial network

Introduction

This project contains a set of tools to detect, analyze and visualize microbial interaction networks from compositional data. The main objective was to implement different tools of the different algorithms to reconstruct, analyze and visualize a co-ocurrence network with microbial interactions. For that, we use standard algorithms for compositional data processing (SparCC), Network or Graphics Algorithms and Umap Algorithms.

This project is part of the collaborative research between Ixulabs and the Laboratory of Experimental and Molecular Evolution, Institute of Ecology, UNAM.

Project Overview


The project is divided into three parts:

  • Visualization
  • SparCC
  • Network Algorithms

Note: Each package has a REAME.md file with a specific description of how the code works.

You can use the free, but with limited capacity, MicNet dashboard

Compositional data cannot be processed or worked like normal data, it has particular geometric and statistical properties. Using different transformations (Dirichlet transformation, Normalization or CLR) and the original data we estimate a mapping through the UMAP algorithm to the Hyperbolic Space. This to have a visualization of the data and its possible interactions. Furthermore, through the HDBSCAN algorithm we detect clusters based on density and which points are possible outliers and noise.

This algorithm is standard for estimating correlation values from compositional data, in order to infer the network's connections. We made several modifications to the original version of the algorithm to improve the ability to process large samples of data. Part of the stages were parallelized and the execution of the algorithm and its statistical tests were better controlled. To run this on your local please refer to the README.md in the sparcc folder.

Network analyses were used to characterise both the overall structure and the local interactions of the microbial network, in which each OTU was represented as a node and the correlations found by SparCC as undirected weighted edges, such that an edge between two nodes implies a relationship between the two corresponding OTUs. Given that most network analyses can only handle positive interactions, we normalized the SparCC correlation matrix from -1 to 1 to a range from 0 to 1, except for the structural balance analysis which directly uses the positive and negative correlation values. The dashboard includes the calculation of large scale metrics of the network, structural balance analysis and community/HDBSCAN subnetwork analysis. To run percolation analysis and topology comparison please refer to the README.md in the network_alg folder.

Local use of dashboard

If you do not wish to use the web app of MicNet which has limited capacity, and you would prefer using your own computer disk and RAM resources, all you have to do is follow these steps. To do this, you must first have the conda package manager installed. If you do not have it installed, we recommend that you install the miniconda or anaconda packages. You can find the instructions to install it at this link:

Miniconda

Anaconda

Once you have conda working on your computer the process is the following:

  1. First be sure to be situated in the MicNet repository folder that you have clone into your computer. Then, the first step to run the dashboard is to create the environment, thus, first you must run the following code in the command line or terminal:

    conda env create -f environment.yml 

    Be sure to check if the environment was created by typing in your command line the following:

    conda env list 

    You should see MicNet-env listed among your environments.

  2. If the MicNet-env environment is in the list, you then need to activate it using the following code:

    conda activate  MicNet-env
  3. Finally, you just need to run the following code to get the web app running, NOTE that this app will be using your computer's computational resources (disk and RAM), thus giving you more power than the one we provide with free memory resources here.

    streamlit run app.py

    The application will open in your browser at the port: localhost:8501

Installation of micnet package

To use the micnet package you can install it via pip. However, you must first create the conda environment as described for the dashboard usage:

streamlit run app.py

Then you can install and use the micnet package:

pip install micnet

Data Details

As validation data, we use the Kombucha data set described in Arikan et al., (2020). .All data can be find in the folder named "Data". The "Kombucha_abundance_table.txt" can be used as input to run the UMAP/HDBSCAN and SparCC modules. We have also included the co-occurence matrix ("Kombucha_Sparcc_matrix.csv") and the HDBSCAN output file ("Kombucha_HDBSCAN.csv") which can be used as input to test the network module. The kombucha example can be easily inspected using the web dashboard at MicNet dashboard.

As a case study we used the Domos Archean data set described in Espinosa-Asuar et al. 2021, a database with more than 2,000 OTUs. All data can be find in the folder named "Data". The "Domos_abundance_table.txt" can be use as input for the UMAP/HDBSCAN and Sparcc modules. We have also provided the co-ocurrence matrix in the file "Domos_Sparcc_matrix.csv" and the HDBSCAN output file ("Domos_HDBSCAN.csv") which can be input to test the network module. Given the size of the Domos dataset, to run this example you should run the dashboard locally in your computer, it cannot be run in the web version.

Licenses

MIT License.

micnettoolbox's People

Contributors

dlegor avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.