facebookresearch / poincaremaps

The need to understand cell developmental processes has spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. However, existing techniques are based on Euclidean geometry which is not an optimal choice for modeling complex cell trajectories with multiple branches. To overcome this fundamental representation issue we propose Poincaré maps, a method harnessing the power of hyperbolic geometry into the realm of single-cell data analysis.

License: Other

poincaremaps's Introduction

PoincareMaps

Poincare maps recover continuous hierarchies in single-cell data.

POC: Anna Klimovskaia ([email protected])

Dependencies

Python 3.7, Anaconda (sklearn, numpy, pandas, scipy), seaborn

PyTorch (pytorch 1.7.1): https://pytorch.org/get-started/locally/

To replicate our experiments

Embedding

python main.py --dset ToggleSwitch       --batchsize -1 --cuda 1 --knn 15 --gamma 2.0 --sigma 1.0 --pca 0  --root root
python main.py --dset MyeloidProgenitors --batchsize -1 --cuda 1 --knn 30 --gamma 2.0 --sigma 2.0 --pca 0  --root root
python main.py --dset krumsiek11_blobs   --batchsize -1 --cuda 1 --knn 30 --gamma 2.0 --sigma 1.0 --pca 20 --root root

python main.py --dset Olsson             --batchsize -1 --cuda 1 --knn 15 --gamma 2.0 --sigma 1.0 --pca 20 --root HSPC-1
python main.py --dset Paul               --batchsize -1 --cuda 1 --knn 15 --gamma 2.0 --sigma 1.0 --pca 20 --root root
python main.py --dset Moignard2015       --batchsize -1 --cuda 1 --knn 30 --gamma 1.0 --sigma 2.0 --pca 0  --root PS
python main.py --dset Planaria           --batchsize -1 --cuda 1 --knn 15 --gamma 2.0 --sigma 2.0 --pca 0 --root neoblast\ 1
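
The per-dataset settings above can also be kept in one table and turned into command lines. This is a minimal sketch: the flag names and values are copied from the commands above, while `SETTINGS` and `build_command` are hypothetical helpers that are not part of the repository.

```python
import shlex

# (dset, knn, gamma, sigma, pca, root), copied from the commands above.
SETTINGS = [
    ("ToggleSwitch", 15, 2.0, 1.0, 0, "root"),
    ("MyeloidProgenitors", 30, 2.0, 2.0, 0, "root"),
    ("krumsiek11_blobs", 30, 2.0, 1.0, 20, "root"),
    ("Olsson", 15, 2.0, 1.0, 20, "HSPC-1"),
    ("Paul", 15, 2.0, 1.0, 20, "root"),
    ("Moignard2015", 30, 1.0, 2.0, 0, "PS"),
    ("Planaria", 15, 2.0, 2.0, 0, "neoblast 1"),
]


def build_command(dset, knn, gamma, sigma, pca, root):
    """Return the argument list for one embedding run (hypothetical helper)."""
    return [
        "python", "main.py", "--dset", dset, "--batchsize", "-1", "--cuda", "1",
        "--knn", str(knn), "--gamma", str(gamma), "--sigma", str(sigma),
        "--pca", str(pca), "--root", root,
    ]


commands = [build_command(*row) for row in SETTINGS]
for argv in commands:
    # shlex.quote keeps "neoblast 1" as a single shell argument.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each list in `commands` could be passed to `subprocess.run` unchanged, which avoids the manual backslash escaping needed for the Planaria root label.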

Prediction

python decoder.py --dset Planaria --cuda 1 --method poincare
python decoder.py --dset Planaria --cuda 1 --method UMAP
python decoder.py --dset Planaria --cuda 1 --method ForceAtlas2

Structure of the repository

Folder datasets contains datasets used in the study.

Folder results contains Poincaré map coordinates.

Folder decoder contains weights of the pretrained decoder network.

Folder predictions contains coordinates of sampled (interpolated) points.

Folder benchmarks contains visualization of benchmark embeddings.

License

PoincareMaps is Attribution-NonCommercial 4.0 International licensed, as found in the LICENSE file.

poincaremaps's Issues

Does PoincareMaps support more than 2 dimensions embedding?

In the function "compute_poincare_maps" on line 66 of main.py, the embedding dimension is hard-coded as 2. Could I change it to other values, such as 3 dimensions? Thanks.
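
For context, the standard Poincaré ball distance is defined for any number of dimensions. The sketch below uses only the textbook formula; `poincare_distance` is an illustrative name, not a function from this repository, and it says nothing about whether the rest of the code (training, plotting) handles more than 2 dimensions.

```python
import math


def poincare_distance(u, v):
    """Distance between two points strictly inside the unit ball, any dimension."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


d2 = poincare_distance((0.0, 0.0), (0.5, 0.0))            # 2 dimensions
d3 = poincare_distance((0.0, 0.0, 0.0), (0.5, 0.0, 0.0))  # 3 dimensions
print(d2, d3)
```

Both calls measure the distance from the origin to a point at Euclidean radius 0.5, so they agree regardless of the dimension of the ball.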

pytorch version

Hi, thanks for releasing this method, it looks very useful. I'm trying to use it on our data. I am not an experienced python user so I'm having trouble getting all the dependencies in place.

What version of pytorch are you using, please?

I am getting the following error when running main.py:

Traceback (most recent call last):
  File "main.py", line 179, in <module>
    color_dict=color_dict)
  File "/home/rstudio/ms_project/PoincareMaps/train.py", line 41, in train
    loss = model.lossfn(model(inputs), targets)
  File "/home/rstudio/miniconda3/envs/ms_py/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/rstudio/ms_project/PoincareMaps/model.py", line 131, in forward
    dists = self.dist()(embs_inputs, embs_all).squeeze(-1)
  File "/home/rstudio/miniconda3/envs/ms_py/lib/python3.7/site-packages/torch/autograd/function.py", line 145, in __call__
    "Legacy autograd function with non-static forward method is deprecated. "
RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

python 3.7.7 on ubuntu 18.04 LTS

Thanks

results of pip freeze

anndata==0.7.3
annoy==1.16.3
certifi==2020.4.5.2
cffi==1.14.0
cycler==0.10.0
Cython==0.29.20
decorator==4.4.2
fastdtw==0.3.2
get-version==2.1
h5py==2.10.0
imageio==2.8.0
importlib-metadata==1.6.1
joblib==0.15.1
kiwisolver==1.2.0
legacy-api-wrap==1.2
leidenalg==0.8.0
llvmlite==0.33.0
matplotlib==3.1.3
mkl-fft==1.0.15
mkl-random==1.1.0
mkl-service==2.3.0
natsort==7.0.1
networkx==2.4
numba==0.50.0
numexpr==2.7.1
numpy==1.18.1
packaging==20.4
pandas==1.0.3
patsy==0.5.1
Pillow==7.1.2
pycairo==1.18.0
pycparser==2.20
pyparsing==2.4.7
python-dateutil==2.8.1
python-igraph==0.7.1.post7
pytz==2020.1
PyWavelets==1.1.1
scanpy==1.5.1
scikit-image==0.17.2
scikit-learn==0.23.1
scipy==1.4.1
scrublet==0.2.1
seaborn==0.10.1
setuptools-scm==4.1.2
six==1.15.0
statsmodels==0.11.1
tables==3.6.1
tbb==2020.0.133
threadpoolctl==2.1.0
tifffile==2020.6.3
torch==1.5.0
tornado==6.0.4
tqdm==4.46.1
umap-learn==0.4.4
zipp==3.1.0
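
The error message asks for a new-style autograd function, meaning one whose `forward` and `backward` are static methods and which is invoked through `.apply` instead of by instantiating the class and calling the instance. The sketch below shows that pattern on a deliberately simple function. `Arcosh` is a hypothetical example class and is not the distance function in the repository's model.py.

```python
import torch


class Arcosh(torch.autograd.Function):
    """Illustrative new-style autograd function: y = arcosh(x) for x > 1."""

    @staticmethod
    def forward(ctx, x):
        # Cache sqrt(x^2 - 1), since d/dx arcosh(x) = 1 / sqrt(x^2 - 1).
        z = torch.sqrt(x * x - 1.0)
        ctx.save_for_backward(z)
        return torch.log(x + z)

    @staticmethod
    def backward(ctx, grad_output):
        (z,) = ctx.saved_tensors
        return grad_output / z


x = torch.tensor([1.5, 2.0, 3.0], dtype=torch.double, requires_grad=True)
y = Arcosh.apply(x)  # new style: call .apply on the class, never Arcosh()(x)
y.sum().backward()
print(y, x.grad)
```

In the traceback above the failing call has the form `self.dist()(...)`, which is the instantiate-then-call style that the message rejects.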

Runtime error with autograd function / module requirements issue?

Greetings and thank you for your work on this novel approach! We are pretty excited to try it out on our own data, and are currently trying to reproduce your results first to make sure we understand how to use the approach.

Upon trying to run one of the embeddings, we run into a problem. Note that the following is from a local install within a Win10/Conda environment. I've provided a package list below as well.

The error is "RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method", so I'll venture that I'm probably not using the precise version of PyTorch that PoincareMaps is expecting?

Thank you for advice/guidance.

(poincare) C:\data\PoincareMaps>python main.py --dset MyeloidProgenitors --batchsize -1 --cuda 1 --knn 30 --gamma 2.0 --sigma 2.0 --pca 0  --root root
Computing laplacian...
Laplacian computed in 0.03 sec
Computing RFA...
RFA computed in 0.02 sec
batchsize =  64
Starting training...
  0%|                                                  | 0/5000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "main.py", line 179, in <module>
    color_dict=color_dict)
  File "C:\data\PoincareMaps\train.py", line 41, in train
    loss = model.lossfn(model(inputs), targets)
  File "C:\Users\cartaij\AppData\Local\Continuum\miniconda3\envs\poincare\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\data\PoincareMaps\model.py", line 131, in forward
    dists = self.dist()(embs_inputs, embs_all).squeeze(-1)
  File "C:\Users\cartaij\AppData\Local\Continuum\miniconda3\envs\poincare\lib\site-packages\torch\autograd\function.py", line 145, in __call__
    "Legacy autograd function with non-static forward method is deprecated. "
RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

# Name                    Version                   Build  Channel
blas                      1.0                         mkl
ca-certificates           2020.1.1                      0
certifi                   2020.4.5.1               py37_0
cudatoolkit               10.2.89              h74a9793_1
cycler                    0.10.0                   py37_0
freetype                  2.9.1                ha9979f8_1
icc_rt                    2019.0.0             h0cc432a_1
icu                       58.2                 ha925a31_3
intel-openmp              2020.1                      216
joblib                    0.15.1                     py_0
jpeg                      9b                   hb83a4c4_2
kiwisolver                1.2.0            py37h74a9793_0
libpng                    1.6.37               h2a8f88b_0
libtiff                   4.1.0                h56a325e_1
lz4-c                     1.9.2                h62dcd97_0
matplotlib                3.1.3                    py37_0
matplotlib-base           3.1.3            py37h64f37c6_0
mkl                       2020.1                      216
mkl-service               2.3.0            py37hb782905_0
mkl_fft                   1.0.15           py37h14836fe_0
mkl_random                1.1.1            py37h47e9c7a_0
ninja                     1.9.0            py37h74a9793_0
numpy                     1.18.1           py37h93ca92e_0
numpy-base                1.18.1           py37hc3f5095_1
olefile                   0.46                     py37_0
openssl                   1.1.1g               he774522_0
pandas                    1.0.3            py37h47e9c7a_0
pillow                    7.1.2            py37hcc1f983_0
pip                       20.0.2                   py37_3
pyparsing                 2.4.7                      py_0
pyqt                      5.9.2            py37h6538335_2
python                    3.7.7                h81c818b_4
python-dateutil           2.8.1                      py_0
pytorch                   1.5.0           py3.7_cuda102_cudnn7_0    pytorch
pytz                      2020.1                     py_0
qt                        5.9.7            vc14h73c81de_0
scikit-learn              0.22.1           py37h6288b17_0
scipy                     1.4.1            py37h9439919_0
seaborn                   0.10.1                     py_0
setuptools                47.1.1                   py37_0
sip                       4.19.8           py37h6538335_0
six                       1.15.0                     py_0
sqlite                    3.31.1               h2a8f88b_1
tk                        8.6.8                hfa6e2cd_0
torchvision               0.6.0                py37_cu102    pytorch
tornado                   6.0.4            py37he774522_1
tqdm                      4.46.0                     py_0
vc                        14.1                 h0510ff6_4
vs2015_runtime            14.16.27012          hf0eaf9b_2
wheel                     0.34.2                   py37_0
wincertstore              0.2                      py37_0
xz                        5.2.5                h62dcd97_0
zlib                      1.2.11               h62dcd97_4
zstd                      1.4.4                ha9fde0e_3
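
The package list above shows pytorch 1.5.0, while the Dependencies section names pytorch 1.7.1. Whether that difference explains the error is not confirmed here. The sketch below only makes the comparison explicit; `version_tuple` is a hypothetical helper.

```python
import re


def version_tuple(version):
    """Extract (major, minor, patch) from strings such as '1.7.1' or '1.5.0+cu102'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


readme_version = "1.7.1"     # named in the Dependencies section
installed_version = "1.5.0"  # from the package list above; torch.__version__ at runtime

same = version_tuple(installed_version) == version_tuple(readme_version)
print(f"installed {installed_version}, README lists {readme_version}, same: {same}")
```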

R wrapper?

Are you thinking about an R-Wrapper for your approach? This could be very useful since a large proportion of the single cell community uses R-packages/pipelines for their analysis.

Rank method used for high dimensional embeddings

Hi,
I think there's a typo remaining in the embedding quality evaluation file. In the paper (https://doi.org/10.1038/s41467-020-16822-4), you state that "To estimate distances in the high-dimensional space δ_ij, we use geodesic distances estimated as the length of a shortest-path in a k-nearest neighbors graph.", which is done by your method:

PoincareMaps/embedding_quality_score.py, line 26 in 7eb2546:

def get_rank_high(data, k_neighbours = 15, knn_sym=True):

but it is not used in the final metric computation:

PoincareMaps/embedding_quality_score.py, lines 180 to 184 in 7eb2546:

Rank_high = get_ranking(D_high)
print('Rank high')

Rank_low = get_ranking(D_low)
print('Rank low')

It would be great to hear from you about that,
Thanks for your good work!
Best
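
To illustrate the distinction raised in this issue, the sketch below ranks neighbors of one point by plain Euclidean distance and by geodesic distance, taken as the shortest-path length in a symmetric k-nearest-neighbors graph. It is a pure-Python toy, not the code in embedding_quality_score.py. All names (`knn_graph`, `geodesic_distances`, `ranks`) and the U-shaped point set are assumptions made for illustration.

```python
import heapq
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_graph(points, k):
    """Symmetric kNN graph as {i: {j: edge_length}}."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted((euclidean(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrize
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from source; unreachable nodes get inf."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return [dist[i] for i in range(len(graph))]


def ranks(row):
    """Rank of each entry in a distance row (0 is the closest)."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    rank = [0] * len(row)
    for r, j in enumerate(order):
        rank[j] = r
    return rank


# A U-shaped curve: the two ends are close in a straight line, far along the curve.
points = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 1), (3, 0)]
euclid_row = [euclidean(points[0], p) for p in points]
geo_row = geodesic_distances(knn_graph(points, k=2), source=0)
print("Euclidean rank of the far end:", ranks(euclid_row)[7])
print("Geodesic rank of the far end:", ranks(geo_row)[7])
```

On this toy curve the far end of the U ranks closer to the start under Euclidean distance than under geodesic distance, which is why the choice of distance for the high-dimensional ranking matters.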

IndexError: index 0 is out of bounds for axis 0 with size 0

Hello,

I am trying out your tool but am running into the same error, randomly after >400 epochs. Would you let me know if you have an idea what's going on?

Thanks a lot,
Jo

python3 main.py --dset data --path data/ --batchsize -1 --cuda 0 --knn 15 --gamma 2.0 --sigma 1.0 --pca 20 --labels 1 --mode features --debugplot 10
Computing laplacian...
Laplacian computed in 0.01 sec
Computing RFA...
RFA computed in 0.00 sec
batchsize = 20
Starting training...
loss: 0.29734: 10%|██▍ | 476/5000 [00:32<04:57, 15.20it/s]
Stopped at epoch 480
PM computed in 32.75 sec
loss = 2.973e-01
time = 0.548 min
Traceback (most recent call last):
  File "main.py", line 207, in <module>
    root_hat = poincare_root(opt, labels, features)
  File "/path/model.py", line 40, in poincare_root
    return head_idx[0]
IndexError: index 0 is out of bounds for axis 0 with size 0
loss: 0.29734: 10%|██▍ | 476/5000 [00:33<05:16, 14.31it/s]
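
The traceback ends at `return head_idx[0]` with an empty `head_idx`. One possible reading, which is an assumption and not confirmed by the source, is that `head_idx` holds the indices of cells whose label equals the requested root and that no label matched (the command above does not pass `--root`). The sketch below shows a guarded lookup under that assumption. `find_root_index` and the example labels are hypothetical.

```python
def find_root_index(labels, root):
    """Return the index of the first cell labeled `root`, or None if absent."""
    matches = [i for i, label in enumerate(labels) if label == root]
    if not matches:
        return None  # indexing matches[0] here would raise IndexError
    return matches[0]


labels = ["HSC", "MEP", "GMP"]          # illustrative labels, not from the issue
print(find_root_index(labels, "HSC"))   # a label that is present
print(find_root_index(labels, "root"))  # a label that is absent
```

If that reading is right, checking that the root label actually occurs in the label column before training would surface the problem earlier.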
