jacoblevine / phenograph

Subpopulation detection in high-dimensional single-cell data

Home Page: http://www.c2b2.columbia.edu/danapeerlab/html/phenograph.html

License: MIT License

Python 100.00%

phenograph's Issues

Reproducibility problem (cluster number)

Hi,

I ran PhenoGraph 4 times on the same input matrix and noticed that the results differ in the number of clusters returned.

How can I set the parameters (or fix the seed) so that the clustering results are reproducible, or at least similar?
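One way to check reproducibility is to cluster the same matrix twice and compare the partitions while ignoring how the clusters happen to be numbered. The sketch below assumes the `clustering_algo` and `seed` arguments that appear in the `cluster(...)` signature printed in the "Permission denied" traceback on this page; older releases may not accept them, and whether the seed also affects the bundled Louvain binary is not stated here. The helper names `same_partition` and `run_twice` are hypothetical.

```python
def same_partition(labels_a, labels_b):
    """Return True if two label vectors describe the same grouping.

    Cluster ids may be numbered differently between runs, so the test is
    whether the label pairs form a one-to-one mapping.
    """
    labels_a, labels_b = list(labels_a), list(labels_b)
    if len(labels_a) != len(labels_b):
        return False
    pairs = set(zip(labels_a, labels_b))
    return len(pairs) == len(set(labels_a)) == len(set(labels_b))


def run_twice(data, seed=0):
    """Cluster the same data twice with a fixed seed and compare the results."""
    # Imported lazily so the comparison helper can be used without the package.
    import phenograph

    # Assumption: this release accepts clustering_algo and seed (see lead-in).
    first, _, _ = phenograph.cluster(data, clustering_algo="leiden", seed=seed)
    second, _, _ = phenograph.cluster(data, clustering_algo="leiden", seed=seed)
    return same_partition(first, second)
```

If `run_twice` returns False with a fixed seed, the remaining variation probably comes from another step, such as the nearest-neighbour search.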

PhenoGraph Shiny App issue

I have run into a new issue when using the ShinyApp. I tried loading in some old PhenoGraph R data like usual, however this time I got this error message:

Warning: Error in load: empty (zero-byte) input file
Stack trace (innermost first):
67: load
66: observeEventHandler [/Library/Frameworks/R.framework/Versions/3.4/Resources/library/cytofkit/shiny/server.R#61]
2: shiny::runApp
1: cytofkitShinyAPP

Any ideas?

parallel computing issue

Hello,

I am trying to use the parallel computing function.
I called phenograph.cluster in an IPython notebook. It gives the following error when use_parallel is set to True:

Launching new cluster with 8 workers
Cluster launched successfully
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<string> in <module>()
TypeError: 'CannedFunction' object is not callable

I also tried setting dview, but that gives another error:

Neighbors computed in 0.06748223304748535 seconds
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<string> in <module>()
/envs/py34/lib/python3.4/site-packages/ipyparallel/util.py in _pull(keys)
    264         return [eval(key, globals()) for key in keys]
    265     else:
--> 266         return eval(keys, globals())
    267 
    268 @interactive
<string> in <module>()
NameError: name 's' is not defined

Any suggestions?

Thank you in advance!

Best,
Yang

Source code for modified Louvain?

Hello,

Thanks a lot for your work! As a Python user, I have found it much easier to call the Louvain code through your code base than to use their provided Python package (which is quite slow).

The PhenoGraph README mentions that you use a modified version of Louvain community detection. Any chance you can make the source code for this modified version available? I need to make a small change to make the results deterministic, but I am not a C++ programmer, so it would really help to start with what you already have rather than try to replicate your described modification from scratch.

Is it possible to set a seed?

Is it possible to set a seed so that the results are reproducible?

https://github.com/BodenmillerGroup/histoCAT/issues/14

Is this repo deprecated?

It appears this repository is outdated compared to the fork https://github.com/dpeerlab/PhenoGraph . The fork contains more up to date installation information (addressing, for example, #20) and has an updated codebase.

@armMSKCC @hisplan is your fork now the preferred access point for this repository? In that case it would be best if this one contains a clear marker in the readme that it is out of date.

import phenograph ModuleNotFoundError: No module named 'phenograph'

Hello, I have installed the phenograph module.
PhenoGraph 1.5.2
But when I import it, an error occurs. What's the problem?
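A common cause of this error is that the package was installed for a different interpreter or environment than the one running the import, though the report does not confirm that this is the case here. The sketch below prints which interpreter is running and whether it can locate the module; `describe_module` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_module(name):
    """Report whether the running interpreter can find a top-level module."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,
        "found": spec is not None,
        # origin is the file the module would be loaded from, if any
        "location": getattr(spec, "origin", None),
    }


print(describe_module("phenograph"))
```

If `found` is False, installing with `python -m pip` from that same interpreter keeps the install and the import in the same environment.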

python2 version?

Any plans for a python2 version? If not, any thoughts on the best way to integrate with python2 code? Thanks!

Permission denied at Louvain

Hi I think there's an error in .
When I run a test:

import numpy as np
import phenograph

tmp = np.random.rand(100,10)
communities, graph, Q = phenograph.cluster(tmp)

I get a permission error. It might be due to this: the error can occur if you are trying to open a file, but your path is a folder.

PermissionError Traceback (most recent call last)
in
6
7 tmp = np.random.rand(100,10)
----> 8 communities, graph, Q = phenograph.cluster(tmp)

~/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/cluster.py in cluster(data, clustering_algo, k, directed, prune, min_cluster_size, jaccard, primary_metric, n_jobs, q_tol, louvain_time_limit, nn_method, partition_type, resolution_parameter, n_iterations, use_weights, seed, **kargs)
348 communities, Q = "", ""
349 if clustering_algo == "louvain":
--> 350 communities, Q = run_louvain(graph, q_tol, louvain_time_limit)
351
352 elif clustering_algo == "leiden":

~/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/cluster.py in run_louvain(graph, q_tol, louvain_time_limit)
162 uid = uuid.uuid1().hex
163 graph2binary(uid, graph)
--> 164 communities, Q = runlouvain(uid, tol=q_tol, time_limit=louvain_time_limit)
165
166 # clean up

~/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/core.py in runlouvain(filename, max_runs, time_limit, tol)
259 filename + "_graph.weights",
260 ]
--> 261 p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
262 out, err = p.communicate()
263 # check for errors from convert

~/.conda/envs/pypandaenv1/lib/python3.8/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
856 encoding=encoding, errors=errors)
857
--> 858 self._execute_child(args, executable, preexec_fn, close_fds,
859 pass_fds, cwd, env,
860 startupinfo, creationflags, shell,

~/.conda/envs/pypandaenv1/lib/python3.8/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
1704 if errno_num != 0:
1705 err_msg = os.strerror(errno_num)
-> 1706 raise child_exception_type(errno_num, err_msg, err_filename)
1707 raise child_exception_type(err_msg)
1708

PermissionError: [Errno 13] Permission denied: '/udd/remge/.conda/envs/pypandaenv1/lib/python3.8/site-packages/phenograph/louvain/linux-convert'
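When `subprocess.Popen` raises Errno 13 for a binary like this, one frequent cause is that the file was installed without its execute bit; the traceback alone does not prove that this is what happened here. A minimal sketch of a workaround, using only the standard library, is shown below. The helper name `make_executable` is hypothetical, and the path to pass in is the one printed in the error message.

```python
import os
import stat


def make_executable(path):
    """Add execute permission wherever read permission is already granted."""
    mode = os.stat(path).st_mode
    # Shift each read bit (r) two places to get the matching execute bit (x).
    read_bits = mode & (stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    os.chmod(path, mode | (read_bits >> 2))
    return os.access(path, os.X_OK)


# Example call, using the path from the error message above:
# make_executable(".../site-packages/phenograph/louvain/linux-convert")
```

Other binaries in the same `louvain` folder may need the same treatment, and the call has to be made by a user who owns the file.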

Windows requires Visual Studio installation?

We have noticed that PhenoGraph only runs once Visual Studio is installed. Unfortunately, it is not clear which packages or DLLs it needs in order to function.

Do you have any feedback on this problem? Is there an easy workaround?

We assume that it is somehow linked to Louvain.

Louvain sources unavailable

Would it be possible to include the sources of the Louvain subdirectory? Also, what is the license of the modified Louvain sources? Will they be released under a free software license like the rest of your code?

As it stands your software cannot be built from source, so it cannot be packaged for GNU Guix, which we use at our institute.

Upload to PyPI

It would be more accessible if phenograph were available on PyPI, so that we could install it with

pip install --user phenograph

Having an issue saving heatmaps as PDFs

Hello!

I'm having an issue with the "save heatmap as a PDF" feature in the Shiny app. I uploaded the R file on a different Mac and was able to download the PDF successfully, but it will not work on our main desktop. Any ideas? Perhaps I need a newer version of R or PhenoGraph?

single cell RNAseq data

Does PhenoGraph only work for CyTOF or flow cytometry data? Have you tested it on single-cell RNA-seq data? Thank you

resolution parameter

In the R implementation, there is a resolution parameter described as "Value of the resolution parameter, use a value above (below) 1.0 if you want to obtain a larger (smaller) number of communities (used only for leiden and louvian 2 or 3 methods)". Is there a parameter here with a similar function? Thanks.

Phenograph seed no. and csv export

Hi,
I have been having some issues with the Windows versions 1.75 and 1.76.

PhenoGraph: the option to set a seed number does not always show up, and the app goes straight to running PhenoGraph after the number of neighbours is selected.
Exporting gates as CSV rarely works, and when it does it takes a very long time.

I only use underscores in my folder names, so I hope that is not causing any of these issues.

Any feedback is much appreciated!

New computer, new error message

I recently got the newest Mac, and ever since I have had an issue opening the heat map in the Shiny app. I get the error: "cannot open file 'cytofkit_shinyAPP_marker_heatmap_plot.pdf'". I have tried loading the data into the Shiny app on other computers, and it has worked. This has also occurred with three different data sets. Thanks!

run on linux

Hello,

I would like to run phenograph on Linux, but the executable files in the louvain folder are not compatible.
I also tried compiling them from the source code downloaded from https://sites.google.com/site/findcommunities/.
However, the compiled files do not work, and the process gets stuck at "Running Louvain modularity optimization". Am I using the wrong code for this library?
Thanks!

Best/ Yang

determine number of clusters

Hi,
Is there a way I can determine the optimal number of clusters in PhenoGraph? Thanks!

Syntax error during import

A syntax error occurs when importing the library:

 File "/usr/local/lib/python3.6/site-packages/phenograph/classify.py", line 45
   print("Warning: iterative solver failed to converge in at least one case", flush=True)
                                                                                   ^
SyntaxError: invalid syntax

Using Python 2.7.15 on Mac OS
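The caret points at `flush=True`, which is only valid when `print` is a function. Under Python 2, `print` is a statement unless `from __future__ import print_function` is in effect, so the file fails to parse before any code runs. A minimal sketch of an interpreter check follows; `require_python3` is a hypothetical helper, and it would have to live in a script that itself parses under Python 2 and run before the package is imported.

```python
import sys


def require_python3(version_info=sys.version_info):
    """Raise a clear error when run under an interpreter older than Python 3."""
    if version_info[0] < 3:
        raise RuntimeError(
            "Python 3 is required, found {}.{}".format(
                version_info[0], version_info[1]
            )
        )
    return True


require_python3()
```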

Occasional memory corruption in Louvain Community

There is a bug in the Louvain community C++ implementation which causes PhenoGraph to occasionally fail to produce meaningful results.
(No actual segfaults occur, though, due to a peculiarity of Linux memory mapping.)

The upstream authors have been contacted.

GPU-boosted implementation of PhenoGraph

I'm writing to share my GPU-boosted implementation of PhenoGraph. Instead of using the CPU-bound libraries numpy, scipy.sparse, and sklearn as in the legacy implementation, I use the GPU-bound libraries cupy, cupyx.sparse, and cudf/cuml from NVIDIA's RAPIDS library to reduce execution time by orders of magnitude for large datasets. For especially large datasets or dataset compilations (~3 million cells x 50 features), the kNN search can be distributed to multiple GPUs, if they are available. For a synthetic dataset of 1 million cells x 30 features, the CPU implementation executes in ~6 hours, whereas the GPU implementation run on a single V100 GPU executes in ~40 seconds (~500-fold speed-up):

Modularity is comparable between GPU and CPU implementations:

Please feel free to link to the repo if interested: https://gitlab.com/eburling/grapheno

Thanks and sorry for the spam! I hope the community finds it useful.

Parallel computation of Jaccard index uses up server resources. Proposed solution.

When I run the algorithm and set n_jobs, I find that more processes are started than expected. I believe this is the cause:
File: PhenoGraph/phenograph/core.py
135: with closing(Pool()) as pool: # replace with: with closing(Pool(n_jobs)) as pool:
136: jaccard_values = pool.starmap(calc_jaccard, zip(range(n), repeat(idx)))

I can correct this in my clone of the library. Is there a reason why you can't set the maximum number of jobs for this process pool?
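The proposed change can be sketched in isolation. `multiprocessing.Pool()` with no argument starts one worker per CPU reported by `os.cpu_count()`, while `Pool(processes=n_jobs)` caps the worker count. In the sketch below, `calc_jaccard` is a simplified stand-in written for illustration, not the library's implementation, and `parallel_jaccard` is a hypothetical wrapper name.

```python
from contextlib import closing
from itertools import repeat
from multiprocessing import Pool


def calc_jaccard(i, idx):
    """Stand-in: Jaccard similarity between row i's neighbours and each neighbour's own list."""
    own = set(idx[i])
    return [len(own & set(idx[j])) / len(own | set(idx[j])) for j in idx[i]]


def parallel_jaccard(idx, n_jobs=None):
    """Compute Jaccard values for every row with at most n_jobs worker processes."""
    # n_jobs=None falls back to one worker per CPU, which is the behaviour
    # the report describes; an integer limits the pool size.
    with closing(Pool(processes=n_jobs)) as pool:
        return pool.starmap(calc_jaccard, zip(range(len(idx)), repeat(idx)))


if __name__ == "__main__":
    neighbours = [[1, 2], [0, 2], [0, 1]]
    print(parallel_jaccard(neighbours, n_jobs=2))
```

Note that scikit-learn style values such as `n_jobs=-1` are not understood by `Pool` and would need to be translated to `None` or a positive integer first.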
