saezlab / liana-py Goto Github PK
View Code? Open in Web Editor NEWLIANA+: an all-in-one framework for cell-cell communication
Home Page: http://liana-py.readthedocs.io/
License: GNU General Public License v3.0
LIANA+: an all-in-one framework for cell-cell communication
Home Page: http://liana-py.readthedocs.io/
License: GNU General Public License v3.0
Hi, liana team
in liana R version, there is 'LIANA across Conditions with cell2cell-Tensor'.
Is it possible to add this feature (or even other features) to liana-py?
Since sc dataset can be run more efficiently in python and liana is such a useful tool.
Thank you.
TODO: add a tutorial on metabolite-mediated CCC using MetalinksDB
I used pip to install LIANA, the installation went smoothly. However, when import it shows
Traceback (most recent call last):
File "/opt/miniconda3/envs/python37/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_2999548/1757714486.py", line 1, in <module>
import liana as li
File "/opt/miniconda3/envs/python37/lib/python3.7/site-packages/liana/__init__.py", line 4, in <module>
from liana import method as mt, plotting as pl, resource as rs, utils as ut, multi as mu, funcomics as fun
File "/opt/miniconda3/envs/python37/lib/python3.7/site-packages/liana/method/__init__.py", line 1, in <module>
from ._Method import Method, MethodMeta, _show_methods
File "/opt/miniconda3/envs/python37/lib/python3.7/site-packages/liana/method/_Method.py", line 133
for sample in (progress_bar := tqdm(samples, disable=not verbose)):
^
SyntaxError: invalid syntax
I believe the error show up because the operation ":=" is not added into python until 3.8. I created another environment with python 3.8, the problem was solved. Can you fix this to make Liana suitable for 3.7?
Describe the bug
Hi !
I'm currently running the Steady-state Ligand-Receptor inference vignette from (https://liana-py.readthedocs.io/en/latest/notebooks/basic_usage.html).
I noticed a bug when running the second dotplot from the Rank Aggregate section.
I obtained :
Instead of :
It should be noted that i change filter_fun parameter by filter_lambda parameter from li.pl.dotplot() function. #80
Best regards
Maxime
Currently permutation-based methods like cellchat
and cellphonedb
need have at least n_perms=1
to be able to run. Since in some situations we don't care about the permuted score or p-val, just the first estimate, it would be good to make this methods handle n_perms=1
The solution could be that the permuted score should be 1 in case of p-values and for z-scores you could set them to 0. Alternatively to nans.
Additional checks and transformation to categories for obs columns: groupby
& sample_key
If either column is not categorical, liana may throw an exception
Currently scanpy is being used as a core dependency of LIANA.
However, in reality it's used only in a couple of places, the most problematic being the calculation of DEGs within _liana_pipe.py
.
If these are changed, then scanpy can be removed from the core dependencies.
Hi Daniel,
Thank you for the fantastic work!
I was looking into applying the hypothesis-testing workflow onto some of my samples and was wondering if you could offer some clarification.
When I run li.multi.df_to_lr after the DEA, is there any way to specify sampleID? If I have many different samples for each condition, is there a way for liana to ensure only sender and receiver cells from the same sample are used, as far as I see this approach is only used in the Tensor C2C or MOFA workflows with li.mt.rank_aggregate.by_sample, but not in this workflow. Would you recommend just filtering the adata to only include each sample one at a time before running liana?
Thank you,
Ido
Hi
I am trying to replicate the results in this link:
https://liana-py.readthedocs.io/en/stable/notebooks/targeted.html
but got this error:
AttributeError: module 'corneto.methods.carnival' has no attribute 'select_mip_solver'
when I run the same code in your link
df_res, problem = li.mt.find_causalnet(
prior_graph,
input_scores,
output_scores,
node_weights,
# penalize (max_penalty) nodes with counts in less than 0.1 of the cells
node_cutoff=0.1,
max_penalty=1,
# the penaly of those in > 0.1 prop of cells set to:
min_penalty=0.01,
edge_penalty=0.01,
verbose=True,
max_seconds=60*3
)
I appreciate any suggestions.
Hi, just a brief question, could you explain why version 0.1.9 removed the steady_rank metric, and how the magnitude_rank differs from that?
I had been conducting analyses previously with a cutoff value of "steady_rank" to determine significant interactions, since the interest is in confident interactions regardless of cell type specificity, and want to choose an appropriate replacement.
Thanks
I was looking for a way to limit plotting using li.pl.dotplot
function to a selection of ligand/receptor complexes. But the plot function does not seem to offer that. What are my options?
Should include MuData to setup.py (currently liana's installation will complain).
Should also consider adding decoupler-py>=1.4.0 as it's used throughout most tutorials (and some internal functions depend on it).
Hello,
I wanted to ask for clarification regarding the output of the causal network vignette "Hypothesis-testing for CCC & Downstream Signalling Networks".
Here is my current understanding:
column name (observed values when only a few) - description
source - source node in the PPI, there are some values here that I don't understand ("_s","_pert_c0","_meas_c0","")
source_type (unmeasured,input) - are the input nodes the receptors that were identified in the previous step? are unmeasured genes that have no expression? or genes that are not part of that receptor group?
source_weight - not sure what this is, the example only has two values (0 and -4.66)
source_pred_val (1,0,-1) - node is upregulated, downregulated, is zero no differential expression?
target - target node in the PPI, there are some values here that I don't understand ("_s","_t","_meas_c0","")
target_type (unmeasured, output) - similar to "source_type" except "output" would be TFs?
target_weight - not sure what this is? the example has a few more values than "source_weight", they are all 0 or positive.
target_pred_val (1,-1) - node is upregulated, downregulated? no zeros in this one
edge_type (1,-1,0) - unsure what this is
edge_pred_val(1,-1) - whether the edge upregulates or downregulates the target node.
Thank you in advance!
Edgar
I should add a class to define misty's models, and enable external object models to be passed, e.g. asserting and handling the features, passing kwargs
Hi,
According to the documentation, 'expr_prop' corresponds to the Minimum expression proportion for the ligands/receptors (and their subunits) in the corresponding cell identities. It is unclear to me what this means.
a) Is it referring to the ratio of expression between ligand / receptor in say Cell type 1? i.e. 0.5 would mean the receptor is twice as expressed as the ligand in Cell type 1 on average (for example, 3/6; ligand/receptor).
b) Is it referring to the minimum proportion of cells within Cell type 1 that express the ligand and the receptor? i.e. 0.5 would mean that both ligand and receptor are expressed in 50% of cells of Cell type 1.
Hi, liana team
glad to see python version of liana. and it is very helpful to my project.
can you give a tutorial on how to run liana-py on other species like mouse?
It seemed that liana R version moved further than the python version.
Thank you.
Wulin
Hello,
To make my analysis reproducible, I am creating an environment for my decoupler analyses and I am not able to run the liana quantification anymore. I am using the code described in the doc to perform this analysis:
liana_lr = ln.resource.select_resource()
liana_lr = ln.resource.explode_complexes(liana_lr)
# Create two new DataFrames, each containing one of the pairs of columns to be concatenated
df1 = liana_lr[['interaction', 'ligand']]
df2 = liana_lr[['interaction', 'receptor']]
# Rename the columns in each new DataFrame
df1.columns = ['interaction', 'genes']
df2.columns = ['interaction', 'genes']
# Concatenate the two new DataFrames
liana_lr = pd.concat([df1, df2], axis=0)
liana_lr['weight'] = 1
# Find duplicated rows
duplicates = liana_lr.duplicated()
# Remove duplicated rows
liana_lr = liana_lr[~duplicates]
I get the following error message:
'liana.resource' has no attribute 'explode_complexes'
It seems that the code of liana select_resource was updated last month. I have the impression that the function ln.resource.explode_complexes is not accessible anymore.
Have you encountered a similar issue?
I am using the following versions:
decoupler 1.4.0
liana 1.0.3
Thanks for your help.
I ran the following code incorrectly:
prior_graph = li.method.build_prior_network(input_pkn, input_scores, output_scores, verbose=True)
The content of the error is:
AttributeError: module 'liana.method' has no attribute 'build_prior_network'
@deeenes I keep finding strange interactions that occur as PPIs but not actual CCC and are reported as such in CellTalkDB - e.g. SORBS1-INSR or interactions with SMAD3. Might be best if we exclude CellTalkDB from the curated resources list in OmniPath, and hence LIANA.
Hello !
I'm currently running the Steady-state Ligand-Receptor inference vignette from (https://liana-py.readthedocs.io/en/latest/notebooks/basic_usage.html).
When running li.pl.dotplot() and li.pl.tileplot() from the vignette i have the following errors :
li.pl.dotplot(adata = adata,
colour='lr_means',
size='cellphone_pvals',
inverse_size=True, # we inverse sign since we want small p-values to have large sizes
source_labels=['CD34+', 'CD56+ NK', 'CD14+ Monocyte'],
target_labels=['CD34+', 'CD56+ NK'],
figure_size=(8, 7),
# finally, since cpdbv2 suggests using a filter to FPs
# we filter the pvals column to <= 0.05
filter_fun=lambda x: x['cellphone_pvals'] <= 0.05,
uns_key='cpdb_res' # uns_key to use, default is 'liana_res'
)
Error :
TypeError: dotplot() got an unexpected keyword argument 'filter_fun'
I noticed that filter_fun parameter have been change as filter_lambda.
I get the same type of error when running li.pl.tileplot() :
my_plot = li.pl.tileplot(adata = adata,
# NOTE: fill & label need to exist for both
# ligand_ and receptor_ columns
fill='means',
label='props',
label_fun=lambda x: f'{x:.2f}',
top_n=10,
orderby='cellphone_pvals',
orderby_ascending=True,
source_labels=['CD34+', 'CD56+ NK', 'CD14+ Monocyte'],
target_labels=['CD34+', 'CD56+ NK'],
uns_key='cpdb_res', # NOTE: default is 'liana_res'
source_title='Ligand',
target_title='Receptor',
)
my_plot
Error :
TypeError: tileplot() got an unexpected keyword argument 'source_title'
source_title and target_title parameters have been removed
So i think this vignette and function documentation in the API are not up to date.
Thanks a lot.
Best regards.
Maxime
Hi, I would really like a way to print the percent expression and CPM values for the (minimally expressed) ligand and receptor in the results dataframe for li.mt.rank_aggregate() or any of the individual methods (which I know also have the supp_cols argument - maybe it is possible there? but rank_aggregate() seems to lose supp_cols)
The ones of interest would be something like:
ligand_percent
receptor_percent
ligand_cpm
receptor_cpm
It'd improve the interpretability of the results a lot - for instance, when return_all_lrs=True and an interaction fails the tests, it gives insight into why it fails, because of the ligand or receptor. Also, it allows for simple TPM threshold-based interaction calling - the most rudimentary method discussed by Armingol et al. 2021 (https://doi.org/10.1038/s41576-020-00292-x), but still a useful one. I don't know if you think that CPM thresholding would work as a "method" to liana, but either way least returning this info in the dataframe would be nice.
Thanks for any input!
Right now LR results are not sorted by default, this can cause problems if users are not aware of it and just take the top N rows of the returned dataframe without checking the stats values.
A solution could be to sort the obtained results by the default stat of each method, be it magnitude or specificity score.
Is it possible to add a parameter **kwargs
to
liana-py/liana/multi/to_tensor_c2c.py
Lines 22 to 34 in 37a0be1
So we can pass them to tensor-cell2cell directly here and control other properties when building the tensor:
liana-py/liana/multi/to_tensor_c2c.py
Lines 127 to 132 in 37a0be1
For example, parameters such as how
, outer_fraction
, among others could be now controlled.
Need to update the way that logging is done in liana -> write one function with logging
to handle different cases.
Hi @dbdimitrov,
this is to follow up our discussion on methods for differential cell2cell communication in theislab/single-cell-best-practices#140.
In your NatComm paper and the CCC book chapter, you seem to focus mostly on methods for inferring steady-state cell2cell interactions. However, recently, I tend to encounter more datasets where I actually want to compare cell2cell communication between conditions.
For instance, in our lung cancer atlas paper, we performed comparisons between the LUAD and LUSC subtypes, or different patient subgroups, driven by the question "how could a certain cell-type shape the tumor microenvironment by expressing certain cell surface ligands?".
I am currently aware of the following methods to compare between conditions:
All these approaches suffer from the same pseudoreplication-bias problem as single-cell DE in general, so one would need to either use pseudo-bulk or LME models to avoid inflated p-values.
In the atlas (and also for other projects), so far I have been using DESeq2 on Pseudobulk and a custom Python reimplementation of the cellphonedb v3 algorithm (https://github.com/icbi-lab/nmp-liver/blob/main/lib/scanpy_helper/scanpy_helpers/cell2cell.py) combined with a list of ligands and receptors from omnipath. I ended up with the following visualization, which shows differentially expressed ligands between conditions at the top, and other cells that express the corresponding receptors (i.e. might be impacted by those changes) at the bottom:
Do you have any other suggestions/recommendations how to best compare cell-cell interactions between conditions? And do you plan to implement any of these methods in liana-py?
Dotplot of ligand-receptor interactions across samples
Required to e.g. supplement analysis with Tensor-cell2cell
Hi I updated to the new release of liana-py 0.1.7
and am getting an error with import liana, I did not have this error with the previous release on the same system and python environment
Python 3.7.2 (default, Dec 18 2019, 16:55:18)
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
import liana
Traceback (most recent call last):
File "", line 1, in
File "/.mounts/labs/abelsonlab/public/Ido/python_venvs/liana_venv/lib/python3.7/site-packages/liana/init.py", line 4, in
from liana import method as mt, plotting as pl, resource as rs, utils as ut, multi as mu, funcomics as fun
File "/.mounts/labs/abelsonlab/public/Ido/python_venvs/liana_venv/lib/python3.7/site-packages/liana/method/init.py", line 1, in
from ._Method import Method, MethodMeta, _show_methods
File "/.mounts/labs/abelsonlab/public/Ido/python_venvs/liana_venv/lib/python3.7/site-packages/liana/method/_Method.py", line 133
for sample in (progress_bar := tqdm(samples, disable=not verbose)):
^
SyntaxError: invalid syntax
Some MILP problems result in multiple equally good (local) solutions. A non-arbitrary way (independent of random state) to deal with those is to create a union of them.
SCIPy will additionally not converge properly with larger problems (now more notable due to recent update and size increase of the OmniPath PPI).
Hi,
Sorry if this information is available somewhere but I cannot find it. I would like to know what version of cellphoneDB (and other databsases) you have inside the liana-py package.
I am aware you import your databases from OmniPathR (unless I am mistaken) and have checked their version of cellphoneDB: saezlab/OmnipathR#85
I wanted to double check that your version matches theirs and if not could it be updated?
Thanks,
Simon
Describe the bug
Hi and thanks for a nice tool. For example, Cxcl9 is missing as ligand in the database, while Cxcl10 is there, and they have many receptors in common. Converting CXCL9 to Cxcl9 should be doable? I'm guessing something has gone wrong and that many interactions are missing in the mouse data.
To Reproduce
Just look in the database
Currently, the documentation is repetitive and with redundant parameter descriptions.
docrep
can be used to store all repetitive parameters as constants - e.g. as done in squidpy:
https://github.com/scverse/squidpy/blob/453015014a4b875db79a84e9ead9c0c59d72d909/squidpy/_docs.py#L11
Describe the bug
Lianapy is incompatible with numpy 1.24.x because "float" is no longer importable
With numpy version 1.23.5
0.40 of entities in the resource are missing from the data.
Generating ligand-receptor stats for 713 samples and 17467 features
Assuming that counts were `natural` log-normalized!
.venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py:419: RuntimeWarning: overflow encountered in power
.venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py:412: RuntimeWarning: invalid value encountered in subtract
venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py:412: RuntimeWarning: invalid value encountered in subtract
Running CellPhoneDB
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 1136.74it/s]
Running Connectome
Running log2FC
Running NATMI
Running SingleCellSignalR
Running CellChat
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:00<00:00, 187.00it/s]
With version 1.24.x
0.40 of entities in the resource are missing from the data.
Generating ligand-receptor stats for 713 samples and 17467 features
Assuming that counts were `natural` log-normalized!
.venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py:419: RuntimeWarning: overflow encountered in power
.venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py:412: RuntimeWarning: invalid value encountered in subtract
.venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py:412: RuntimeWarning: invalid value encountered in subtract
Running CellPhoneDB
Traceback (most recent call last):
File "src/interpret/lianapy/script.py", line 63, in <module>
main()
File "script.py", line 41, in main
liana.mt.rank_aggregate(
File ".venv/lib/python3.10/site-packages/liana/method/sc/_rank_aggregate.py", line 139, in __call__
liana_res = liana_pipe(adata=adata,
File ".venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py", line 185, in liana_pipe
_run_method(lr_res=lr_res.copy(),
File "venv/lib/python3.10/site-packages/liana/method/_liana_pipe.py", line 462, in _run_method
_get_means_perms(adata=adata,
File "venv/lib/python3.10/site-packages/liana/method/_pipe_utils/_get_mean_perms.py", line 49, in _get_means_perms
if isinstance(norm_factor, np.float):
File ".venv/lib/python3.10/site-packages/numpy/__init__.py", line 305, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'cfloat'?
To Reproduce
If possible please provide a minimal reproducible example.
Unfortunatly, I am not able to send any data. However, I think this bug can be reproduced by using any random anndata file as long as version of numpy is 1.24.0 or older.
Hi !
I was trying "Hypothesis-testing for CCC & Downstream Signalling Networks" tutorial.
During the "DEA to Ligand-Receptor Interactions" part I confront an issue.
When I run,
lr_res = li.multi.df_to_lr(aadata,
dea_df=dea_df,
resource_name='consensus',
expr_prop=0.1, # calculated for adata as passed - used to filter interactions
groupby='cell_type2',
stat_keys=['stat', 'pvalue', 'padj'],
use_raw=False,
complex_col='stat', # NOTE: we use the Wald Stat to deal with complexes
verbose=True,
return_all_lrs=False,
)
I get warning that "all gene name values are not are not valid obs/ var names or indices."
I checked, that aadata's var names has all genes.
I am attaching a screenshot of the situation.
I wonder what went wrong.
Best regards
Hi! I have tried to use the method rank_aggregate() and cellphonedb() in Liana. Both have the below issue. I checked my data as shown below. Wonder how this key error always happens. Thank you!
li.mt.rank_aggregate(adata, groupby='celltype.lv1', expr_prop=0.1, verbose=True, key_added='cpdb_res')
cellphonedb(adata, groupby='celltype.lv1', expr_prop=0.1, use_raw=False, return_all_lrs=True, verbose=True)
File ~/anaconda3/envs/scenicplus/lib/python3.8/site-packages/liana/method/_pipe_utils/_pre.py:57, in assert_covered(subset, superset, subset_name, superset_name, prop_missing_allowed, verbose)
50 if prop_missing > prop_missing_allowed:
51 msg = (
52 f"Please check if appropriate organism/ID type was provided! "
53 f"Allowed proportion ({prop_missing_allowed}) of missing "
54 f"{subset_name} elements exceeded ({prop_missing:.2f}). "
55 f"Too few features from the resource were found in the data."
56 )
---> 57 raise ValueError(msg + f" [{x_missing}] missing from {superset_name}")
59 if verbose & (prop_missing > 0):
60 print(f"{prop_missing:.2f} of entities in the resource are missing from the data.")
ValueError: Please check if appropriate organism/ID type was provided! Allowed proportion (0.98) of missing resource elements exceeded (1.00). Too few features from the resource were found in the data. [A1BG, A2M, AANAT, ABCA1, ACE, ACKR1, ACKR2, ACKR3, ACKR4, ACTR2, ACVR1, ACVR1B, ACVR1C, ACVR2A, ACVR2B, ACVRL1, ADA, ADAM10, ADAM11, ADAM12, ADAM15, ADAM17, ADAM2, ADAM22, ADAM23, ADAM28, ADAM29, ADAM7, ADAM9, ADAMTS3, ADCY1, ADCY7, ADCY8, ADCY9 ....] missing from var_names
print(adata)
AnnData object with n_obs × n_vars = 24062 × 2000
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'percent.mt', 'time_point', 'RNA_snn_res.0.5', 'seurat_clusters', 'celltype.lv1', 'celltype.lv2', 'time_month', 'scDblFinder.class', 'scDblFinder.score', 'celltype.lv3'
var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'
obsm: 'X_pca', 'X_umap'
varm: 'PCs'
adata.X, adata.raw.X
(array([[-0.32665763, -0.34369911, -0.19467216, ..., -0.48180683,
2.08207319, -0.05807534],
[ 0.73692432, -0.34369911, -0.19467216, ..., 2.23740636,
1.67791563, -0.05807534],
[-0.32665763, -0.34369911, -0.19467216, ..., -0.48180683,
-0.45570303, -0.05807534],
...,
[-0.32665763, -0.34369911, -0.19467216, ..., 2.89027397,
-0.45570303, -0.05807534],
[ 5.5336266 , -0.34369911, -0.19467216, ..., -0.48180683,
-0.45570303, -0.05807534],
[-0.32665763, -0.34369911, -0.19467216, ..., -0.48180683,
2.08376311, -0.05807534]]),
<24062x32285 sparse matrix of type '<class 'numpy.float64'>'
with 85893851 stored elements in Compressed Sparse Row format>)
LIANA's resources should be directly accessed via Pypath.
Extensive PK graphs with BioCypher would also allow further refining of results and inference.
This is to be added in future updates.
I have been trying to run liana and I get this error. I have checked var_names and they are in the human gene symbol format. Furthermore, I have manually checked the var_names that are printed in the error message as "missing from var_names" are actually present in the adata. I was hoping to get some further guidance on how I could resolve this. Thank you.
li.mt.rank_aggregate(a_immune, groupby='cell_type', expr_prop=0.1, verbose=True)
ValueError: Please check if appropriate organism/ID type was provided! Allowed proportion (0.98) of missing resource elements exceeded (1.00). Too few features from the resource were found in the data. [A1BG, A2M, AANAT, ABCA1, ACE, ACKR1, ACKR2, ACKR3, ACKR4, ACTR2, ACVR1, ACVR1B, ACVR1C, ACVR2A, ACVR2B, ACVRL1, ADA, ADAM10, ADAM11, ADAM12, ADAM15, ADAM17, ADAM2, ADAM22, ADAM23, ADAM28, ADAM29, ADAM7, ADAM9, ADAMTS3, ADCY1, ADCY7, ADCY8, ADCY9, ADCYAP1, ADCYAP1R1, ADGRA2, ADGRB1, ADGRE2, ADGRE5, ADGRG1, ... WNT16, WNT2, WNT2B, WNT3, WNT3A, WNT4, WNT5A, WNT5B, WNT6, WNT7A, WNT7B, WNT8A, WNT8B, WNT9A, WNT9B, XCL1, XCL2, XCR1, YBX1, ZG16B, ZNRF3, ZP3] missing from var_names
When building MistyData
objects, the .X
in each view is transformed into sparse csr_matrix
format. This is not optimal to do in dense matrices with few zeros since the csr_matrix
is going to consume more memory than its dense counterpart. I would maybe check the proportion of 0s and then decide.
I have my own anndata running the through the Jupyter notebook. Everything works fine until the saving the h5ad files after running mofa and below is the error message in detail:
Warning: Output file models/mofacell.h5ad already exists, it will be replaced
Saving model in models/mofacell.h5ad...
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [13], line 1
----> 1 mu.tl.mofa(mdata,
2 use_obs='union',
3 convergence_mode='medium',
4 n_factors=5,
5 seed=1337,
6 outfile='models/mofacell.h5ad',
7 use_var='highly_variable',
8 gpu_mode=True
9
10 )
File /software/team283/tl7/miniconda3/envs/scarches/lib/python3.9/site-packages/muon/_core/tools.py:585, in mofa(data, groups_label, use_raw, use_layer, use_var, use_obs, likelihoods, n_factors, scale_views, scale_groups, center_groups, ard_weights, ard_factors, spikeslab_weights, spikeslab_factors, n_iterations, convergence_mode, gpu_mode, use_float32, smooth_covariate, smooth_warping, smooth_kwargs, save_parameters, save_data, save_metadata, seed, outfile, expectations, save_interrupted, verbose, quiet, copy)
582 data.obsm["X_mofa"] = z
584 # Weights
--> 585 w = np.concatenate([v[:, :] for k, v in f["expectations"]["W"].items()], axis=1).T
586 if use_var:
587 # Set the weights of features that were not used to zero
588 data.varm["LFs"] = np.zeros(shape=(data.n_vars, w.shape[1]))
File /software/team283/tl7/miniconda3/envs/scarches/lib/python3.9/site-packages/muon/_core/tools.py:585, in <listcomp>(.0)
582 data.obsm["X_mofa"] = z
584 # Weights
--> 585 w = np.concatenate([v[:, :] for k, v in f["expectations"]["W"].items()], axis=1).T
586 if use_var:
587 # Set the weights of features that were not used to zero
588 data.varm["LFs"] = np.zeros(shape=(data.n_vars, w.shape[1]))
File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()
File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()
File /software/team283/tl7/miniconda3/envs/scarches/lib/python3.9/site-packages/h5py/_hl/group.py:330, in Group.__getitem__(self, name)
328 oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
329 else:
--> 330 raise TypeError("Accessing a group is done with bytes or str, "
331 " not {}".format(type(name)))
333 otype = h5i.get_type(oid)
334 if otype == h5i.GROUP:
TypeError: Accessing a group is done with bytes or str, not <class 'tuple'>
Hello!
First of all, thank you for this very complete package!
I am trying to run lr_bivar
function on a spatial object but I am running into this error:
ValueError: Observations annot. `var` must have as many rows as `X` has columns (17353), but has 17506 rows.
On note, I am creating the anndata object from an .h5ad file writen from a Seurat object. Nevertheless, I checked var
and X
dimensions and they were the same (17353). I think the error may be related to lr_bivar
creating another anndata object within the function.
I have included all the files needed to create the spatial anndata object in this folder.
Here is my code:
import pandas as pd
import anndata as ad
import liana as li
# Read H5AD
adata = ad.read_h5ad("path_to/CID44971.h5ad")
adata.layers['counts'] = adata.X.copy()
# Fill image info
from pathlib import Path
from matplotlib.image import imread
import json
path = Path("path_to/CID44971_spatial/slide1/")
library_id = "library_id"
adata.uns["spatial"] = {"library_id": {}}
files = dict(
tissue_positions_file = path / "tissue_positions_list.csv",
scalefactors_json_file = path / "scalefactors_json.json",
hires_image = path / "tissue_hires_image.png",
lowres_image = path / "tissue_lowres_image.png",
)
adata.uns["spatial"][library_id]["images"] = dict()
for res in ["hires", "lowres"]:
adata.uns["spatial"][library_id]["images"][res] = imread(
str(files[f"{res}_image"])
)
adata.uns["spatial"][library_id]["scalefactors"] = json.loads(
files["scalefactors_json_file"].read_bytes()
)
positions = pd.read_csv(
files["tissue_positions_file"],
header = None,
index_col = 0,
)
positions.columns = [
"in_tissue",
"array_row",
"array_col",
"pxl_col_in_fullres",
"pxl_row_in_fullres",
]
adata.obs = adata.obs.join(positions, how = "left")
adata.obsm["spatial"] = adata.obs[["x", "y"]].to_numpy()
adata.obs.drop(columns=["x", "y"], inplace = True)
# Liana
li.ut.spatial_neighbors(adata, bandwidth = 200, cutoff = 0.1, kernel = "gaussian", set_diag = True)
li.mt.lr_bivar(adata,
function_name = "cosine", # Name of the function
n_perms = 100, # Number of permutations to calculate a p-value
mask_negatives = False, # Whether to mask LowLow/NegativeNegative interactions
add_categories = True, # Whether to add local categories to the results
expr_prop = 0.2, # Minimum expr. proportion for ligands/receptors and their subunits
use_raw = False,
verbose = True)
And here is the complete error message:
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/liana/method/sp/_lr_bivar.py:86, in SpatialLR.__call__(self, adata, function_name, resource_name, resource, interactions, expr_prop, n_perms, mask_negatives, seed, add_categories, use_raw, layer, connectivity_key, inplace, key_added, obsm_added, lr_sep, verbose)
15 def __call__(self,
16 adata: AnnData,
17 function_name: str,
(...)
33 verbose: Optional[bool] = False,
34 ):
35 \"\"\"
36 Local ligand-receptor interaction metrics and global scores.
37
(...)
83 if `inplace=False`, returns a the above.
84 \"\"\"
---> 86 lr_res, local_scores = super().__call__(
87 mdata=adata,
88 function_name=function_name,
89 connectivity_key=connectivity_key,
90 resource_name=resource_name,
91 resource=resource,
92 interactions=interactions,
93 nz_threshold=expr_prop,
94 n_perms=n_perms,
95 mask_negatives=mask_negatives,
96 add_categories=add_categories,
97 x_mod=True,
98 y_mod=True,
99 x_use_raw=use_raw,
100 x_layer=layer,
101 seed=seed,
102 verbose=verbose,
103 xy_sep=lr_sep,
104 x_name='ligand',
105 y_name='receptor',
106 inplace=False,
107 complex_sep='_'
108 )
110 return self._handle_return(adata, lr_res, local_scores, key_added, obsm_added, inplace)
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/liana/method/sp/_SpatialBivariate.py:221, in SpatialBivariate.__call__(self, mdata, x_mod, y_mod, function_name, interactions, resource, resource_name, connectivity_key, mod_added, key_added, mask_negatives, add_categories, n_perms, seed, nz_threshold, x_use_raw, x_layer, x_transform, y_use_raw, y_layer, y_transform, x_name, y_name, complex_sep, xy_sep, remove_self_interactions, inplace, verbose)
218 assert_covered(entities, adata.var_names, verbose=verbose)
220 # Filter to only include the relevant features
--> 221 adata = adata[:, np.intersect1d(entities, adata.var.index)]
223 xy_stats = pd.DataFrame({'means': adata.X.mean(axis=0).A.flatten(),
224 'props': _get_props(adata.X)},
225 index=adata.var_names
226 ).reset_index().rename(columns={'index': 'gene'})
227 # join global stats to LRs from resource
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:1171, in AnnData.__getitem__(self, index)
1169 \"\"\"Returns a sliced view of the object.\"\"\"
1170 oidx, vidx = self._normalize_indices(index)
-> 1171 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:360, in AnnData.__init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
358 if not isinstance(X, AnnData):
359 raise ValueError(\"`X` has to be an AnnData object.\")
--> 360 self._init_as_view(X, oidx, vidx)
361 else:
362 self._init_as_actual(
363 X=X,
364 obs=obs,
(...)
376 filemode=filemode,
377 )
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:416, in AnnData._init_as_view(self, adata_ref, oidx, vidx)
414 # fix categories
415 uns = copy(adata_ref._uns)
--> 416 self._remove_unused_categories(adata_ref.obs, obs_sub, uns)
417 self._remove_unused_categories(adata_ref.var, var_sub, uns)
418 # set attributes
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:1181, in AnnData._remove_unused_categories(self, df_full, df_sub, uns)
1179 all_categories = df_full[k].cat.categories
1180 with pd.option_context(\"mode.chained_assignment\", None):
-> 1181 df_sub[k] = df_sub[k].cat.remove_unused_categories()
1182 # also correct the colors...
1183 color_key = f\"{k}_colors\"
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/views.py:78, in _SetItemMixin.__setitem__(self, idx, value)
71 else:
72 warnings.warn(
73 f\"Trying to modify attribute `.{self._view_args.attrname}` of view, \"
74 \"initializing view as actual.\",
75 ImplicitModificationWarning,
76 stacklevel=2,
77 )
---> 78 with view_update(*self._view_args) as container:
79 container[idx] = value
File ~/miniconda3/envs/liana/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
135 del self.args, self.kwds, self.func
136 try:
--> 137 return next(self.gen)
138 except StopIteration:
139 raise RuntimeError(\"generator didn't yield\") from None
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/views.py:52, in view_update(adata_view, attr_name, keys)
32 @contextmanager
33 def view_update(adata_view: AnnData, attr_name: str, keys: tuple[str, ...]):
34 \"\"\"Context manager for updating a view of an AnnData object.
35
36 Contains logic for \"actualizing\" a view. Yields the object to be modified in-place.
(...)
50 `adata.attr[key1][key2][keyn]...`
51 \"\"\"
---> 52 new = adata_view.copy()
53 attr = getattr(new, attr_name)
54 container = reduce(lambda d, k: d[k], keys, attr)
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:1582, in AnnData.copy(self, filename)
1576 if not self.isbacked:
1577 if self.is_view and self._has_X():
1578 # TODO: How do I unambiguously check if this is a copy?
1579 # Subsetting this way means we don’t have to have a view type
1580 # defined for the matrix, which is needed for some of the
1581 # current distributed backend. Specifically Dask.
-> 1582 return self._mutated_copy(
1583 X=_subset(self._adata_ref.X, (self._oidx, self._vidx)).copy()
1584 )
1585 else:
1586 return self._mutated_copy()
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:1527, in AnnData._mutated_copy(self, **kwargs)
1525 elif self.raw is not None:
1526 new[\"raw\"] = self.raw.copy()
-> 1527 return AnnData(**new)
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:362, in AnnData.__init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
360 self._init_as_view(X, oidx, vidx)
361 else:
--> 362 self._init_as_actual(
363 X=X,
364 obs=obs,
365 var=var,
366 uns=uns,
367 obsm=obsm,
368 varm=varm,
369 raw=raw,
370 layers=layers,
371 dtype=dtype,
372 shape=shape,
373 obsp=obsp,
374 varp=varp,
375 filename=filename,
376 filemode=filemode,
377 )
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:548, in AnnData._init_as_actual(self, X, obs, var, uns, obsm, varm, varp, obsp, raw, layers, dtype, shape, filename, filemode)
544 # annotations
545 self._obs = _gen_dataframe(
546 obs, [\"obs_names\", \"row_names\"], source=source, attr=\"obs\", length=n_obs
547 )
--> 548 self._var = _gen_dataframe(
549 var, [\"var_names\", \"col_names\"], source=source, attr=\"var\", length=n_vars
550 )
552 # now we can verify if indices match!
553 for attr_name, x_name, idx in x_indices:
File ~/miniconda3/envs/liana/lib/python3.11/functools.py:909, in singledispatch.<locals>.wrapper(*args, **kw)
905 if not args:
906 raise TypeError(f'{funcname} requires at least '
907 '1 positional argument')
--> 909 return dispatch(args[0].__class__)(*args, **kw)
File ~/miniconda3/envs/liana/lib/python3.11/site-packages/anndata/_core/anndata.py:180, in _gen_dataframe_df(anno, index_names, source, attr, length)
170 @_gen_dataframe.register(pd.DataFrame)
171 def _gen_dataframe_df(
172 anno: pd.DataFrame,
(...)
177 length: int | None = None,
178 ):
179 if length is not None and length != len(anno):
--> 180 raise _mk_df_error(source, attr, length, len(anno))
181 anno = anno.copy(deep=False)
182 if not is_string_dtype(anno.index):
ValueError: Observations annot. `var` must have as many rows as `X` has columns (17353), but has 17506 rows."
}
Thank you very much for your help!
LIANA+ supports MuData as input, hence with appropriate data transformation and resource; any multi-omics dataset can be used.
To highlight this, I should include a tutorial on multi-omics data - e.g. CITE-seq and spatial CITE-seq
Hello, very powerful method to prediction CCC from scRNA-seq.
When I try to use the "li.mt.rank_aggregate", the CellChat run slowly.
Would you speed up the process based on GPU?
Thanks!
I should separate Global Moran's R from local, and implement more simple bivariate metrics, e.g. Geary's C
When running multiple misty models across slides it would be good to have a function to aggregate them.
Here is my code.
my_plot = li.pl.tileplot(adata = adata,
# NOTE: fill & label need to exist for both
# ligand_ and receptor_ columns
fill='means',
label='props',
label_fun=lambda x: f'{x:.2f}',
top_n=10,
orderby='cellphone_pvals',
orderby_ascending=True,
source_labels=["T cell", "NK cell",],
target_labels=["DC", "IFN-TAMs", "Inflam-TAMs", "Prolif-TAMs", "Reg-TAMs", "LA-TAMs", "Monocyte", "Neutrophils"],
# uns_key='cpdb_res' # NOTE: default is 'liana_res',
figure_size=(10, 6),
)
my_plot
as you see, The scales of the left and right colormaps are different.
Thank you for fix, in advance.
Currently, interactions are not sorted in li.pl.tileplot
, would be good to sort them by the orderby
argument.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.