Comments (3)

sbslee avatar sbslee commented on August 22, 2024

@rupesh-sinha,

Thank you for the feature request!

The reason I didn't add an option like where to the beta_2d_plot function is that I was afraid it might be misused, intentionally or not. As you may already know, a PCoA plot is supposed to show all of the samples that were used to calculate the original distance matrix. If a PCoA plot shows only a subset of those samples, it should not be used to draw any definitive conclusions. For example, when you hide some samples in an Emperor QZV file using QIIME 2 View, it will do so, but with a warning: "WARNING: hiding samples in an ordination can be misleading". Below you will see that I removed the "left palm" samples (blue). The warning is displayed in the top left corner.

[Screenshot: QIIME 2 View with the "left palm" samples hidden and the warning displayed]

Therefore, if the goal is to see how the PCoA looks in the absence of certain samples, you should calculate a new distance matrix from the remaining samples and plot it again. This is easy to do within Dokdo, and the ordinate function's documentation illustrates it (compare the first and second figures there). Here is the code example:

import dokdo
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set()

qza_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/table.qza'
metadata_file = '/Users/sbslee/Desktop/dokdo/data/moving-pictures-tutorial/sample-metadata.tsv'

pcoa_results = dokdo.ordinate(qza_file)

dokdo.beta_2d_plot(
    pcoa_results,
    metadata=metadata_file,
    hue='body-site',
    figsize=(8, 8)
)

plt.tight_layout()

[Figure: ordinate-1, PCoA plot of all samples colored by body-site]

from qiime2 import Metadata

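# Load the sample metadata as a DataFrame and keep only the gut and left palm samples.
mf = dokdo.get_mf(metadata_file)
mf = mf[mf['body-site'].isin(['gut', 'left palm'])]

# Recompute the distance matrix and PCoA from the retained samples only.
pcoa_results = dokdo.ordinate(qza_file, metadata=Metadata(mf))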

dokdo.beta_2d_plot(
    pcoa_results,
    metadata=metadata_file,
    hue='body-site',
    figsize=(8, 8)
)

plt.tight_layout()

[Figure: ordinate-2, PCoA plot recomputed from the gut and left palm samples only]

I think there is still an argument to be made in favor of adding an option like where to the beta_2d_plot function. For example, the user may want to explore the PCoA space in greater depth. However, in that case the user should use QIIME 2 View, because that is exactly what the platform exists for.

Hope this helps!


rupesh-sinha avatar rupesh-sinha commented on August 22, 2024

Thanks @sbslee

This works perfectly.
I understand the concern for hiding some samples and was looking for something like this to recompute the distance matrix for the selected samples. Thanks for your excellent support with dokdo!

When I recompute using the function you suggested in the second part of the script, I get a plot along with a warning message like this:
/home/rupesh/anaconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/sklearn/metrics/pairwise.py:1776: DataConversionWarning: Data was converted to boolean for metric jaccard
warnings.warn(msg, DataConversionWarning)

What does this actually mean?
Again thanks for your support.


sbslee avatar sbslee commented on August 22, 2024

@rupesh-sinha,

Glad to know that Dokdo is helpful to you!

As for the warning, you don't have to worry about it. It simply means that the numeric data was converted to boolean (e.g. 442 --> True and 0 --> False). The conversion is needed because you are using the Jaccard distance as the distance metric, which computes the distance between samples from the presence/absence of features rather than from their abundance. Hope this helps.
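To make the presence/absence point concrete, here is a minimal pure-Python sketch of a Jaccard distance between samples. It is only an illustration, not the scikit-learn code that emits the warning: the function name jaccard_distance and the count vectors are made up for the example.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two samples, using presence/absence only."""
    # Convert counts to booleans: any non-zero count means "present".
    present_a = [count > 0 for count in a]
    present_b = [count > 0 for count in b]
    # Features present in both samples, and features present in at least one.
    shared = sum(x and y for x, y in zip(present_a, present_b))
    either = sum(x or y for x, y in zip(present_a, present_b))
    # Two samples with no features at all are treated as identical here.
    return 0.0 if either == 0 else 1.0 - shared / either


# Hypothetical feature counts for four samples (one column per feature).
sample_1 = [442, 0, 13, 0]
sample_2 = [5, 0, 270, 0]   # same features present as sample_1, different counts
sample_3 = [0, 88, 0, 7]    # no features in common with sample_1
sample_4 = [442, 88, 0, 0]  # shares one feature with sample_1

print(jaccard_distance(sample_1, sample_2))
print(jaccard_distance(sample_1, sample_3))
print(jaccard_distance(sample_1, sample_4))
```

Note that sample_1 and sample_2 come out at distance 0 even though their abundances differ a lot, because only the set of features present in each sample matters. That is the information the boolean conversion keeps.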
