theislab / cellrank Goto Github PK
View Code? Open in Web Editor NEWCellRank: dynamics from multi-view single-cell data
Home Page: https://cellrank.org
License: BSD 3-Clause "New" or "Revised" License
CellRank: dynamics from multi-view single-cell data
Home Page: https://cellrank.org
License: BSD 3-Clause "New" or "Revised" License
New name: CFLARE - Clustering and Filtering of Left and Right Eigenvectors.
At the moment, the x-axis seems to always be labelled with pseudotime
, see line 15 & 16 in our basic tutorial, even though latent_time
is specified. Can we change this, to make it more obvious what we really show on the x-axis?
When I call mc_fwd.compute_lin_probs(keys=['Ductal', 'Alpha, Beta'])
, I get the traceback
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-33-4d1908c3199d> in <module>
----> 1 mc_fwd.compute_lin_probs(keys=['Ductal', 'Alpha, Beta'])
2 mc_fwd.plot_lin_probs()
~/Projects/cellrank/cellrank/tools/_markov_chain.py in compute_lin_probs(self, keys, check_irred, norm_by_frequ)
814 else:
815 logg.debug(f"DEBUG: Combining recurrent classes according to `{keys}`")
--> 816 approx_rcs_ = self._prep_rc_classes(keys)
817 keys = list(approx_rcs_.cat.categories)
818
~/Projects/cellrank/cellrank/tools/_markov_chain.py in _prep_rc_classes(self, keys)
1259
1260 lin_colors = list(np.array(lin_colors)[cat_mask])
-> 1261 approx_rcs_temp.cat.reorder_categories(remaining_cat, inplace=True)
1262 self._lin_probs = Lineage(
1263 np.empty((1, len(lin_colors))),
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/accessor.py in f(self, *args, **kwargs)
91 def _create_delegator_method(name):
92 def f(self, *args, **kwargs):
---> 93 return self._delegate_method(name, *args, **kwargs)
94
95 f.__name__ = name
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in _delegate_method(self, name, *args, **kwargs)
2623
2624 method = getattr(self._parent, name)
-> 2625 res = method(*args, **kwargs)
2626 if res is not None:
2627 return Series(res, index=self._index, name=self._name)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in reorder_categories(self, new_categories, ordered, inplace)
1032 if set(self.dtype.categories) != set(new_categories):
1033 raise ValueError(
-> 1034 "items in new_categories are not the same as in " "old categories"
1035 )
1036 return self.set_categories(new_categories, ordered=ordered, inplace=inplace)
ValueError: items in new_categories are not the same as in old categories
> /Users/marius/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/arrays/categorical.py(1034)reorder_categories()
1032 if set(self.dtype.categories) != set(new_categories):
1033 raise ValueError(
-> 1034 "items in new_categories are not the same as in " "old categories"
1035 )
1036 return self.set_categories(new_categories, ordered=ordered, inplace=inplace)
If you fix this, please make sure that lineages still get assigned to the right color by simply running our tutorial.
I finally got my rpy2 to run and I'm now really keen to get your gene_trends to run using R's mgcv
. However, when I run
model = cr.ul.models.GamMGCVModel(adata, n_splines=7, sp=100)
cr.pl.gene_trends(adata, model=model, data_key='Ms',
genes=['Gng12', 'Pak3'],
time_key='latent_time',
weight_threshold=0.05, weight_scale=1, filter_data=True, dpi=25)
I get the error
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-759d7e9f342a> in <module>
8 genes=['Gng12', 'Pak3'],
9 time_key='latent_time',
---> 10 weight_threshold=0.05, weight_scale=1, filter_data=True, dpi=25)
~/Projects/cellrank/cellrank/plotting/_gene_trend.py in gene_trends(adata, model, genes, lineages, data_key, final, start_lineage, end_lineage, conf_int, same_plot, hide_cells, perc, lineage_cmap, abs_prob_cmap, cell_color, color, cell_alpha, lineage_alpha, size, lw, show_cbar, margins, sharey, figsize, dpi, ncols, n_jobs, backend, ext, suptitle, save, dirname, plot_kwargs, show_progres_bar, **kwargs)
271 extractor=lambda modelss: {k: v for m in modelss for k, v in m.items()},
272 show_progress_bar=show_progres_bar,
--> 273 )(lineages, start_lineage, end_lineage, **kwargs)
274 logg.info(" Finish", time=start)
275
~/Projects/cellrank/cellrank/utils/_parallelize.py in wrapper(*args, **kwargs)
98 *((i, cs) if use_ixs else (cs,)), *args, **kwargs, queue=queue
99 )
--> 100 for i, cs in enumerate(collections)
101 )
102
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/parallel.py in __call__(self, iterable)
1002 # remaining jobs.
1003 self._iterating = False
-> 1004 if self.dispatch_one_batch(iterator):
1005 self._iterating = self._original_iterator is not None
1006
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
833 return False
834 else:
--> 835 self._dispatch(tasks)
836 return True
837
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/parallel.py in _dispatch(self, batch)
752 with self._lock:
753 job_idx = len(self._jobs)
--> 754 job = self._backend.apply_async(batch, callback=cb)
755 # A job can complete so quickly than its callback is
756 # called before we get here, causing self._jobs to
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
207 def apply_async(self, func, callback=None):
208 """Schedule a func to be run"""
--> 209 result = ImmediateResult(func)
210 if callback:
211 callback(result)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
588 # Don't delay the application, to avoid keeping the input
589 # arguments in memory
--> 590 self.results = batch()
591
592 def get(self):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/parallel.py in __call__(self)
254 with parallel_backend(self._backend, n_jobs=self._n_jobs):
255 return [func(*args, **kwargs)
--> 256 for func, args, kwargs in self.items]
257
258 def __len__(self):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/joblib/parallel.py in <listcomp>(.0)
254 with parallel_backend(self._backend, n_jobs=self._n_jobs):
255 return [func(*args, **kwargs)
--> 256 for func, args, kwargs in self.items]
257
258 def __len__(self):
~/Projects/cellrank/cellrank/plotting/_utils.py in _fit(genes, lineage_names, start_lineages, end_lineages, queue, **kwargs)
412 model = (
413 models[gene][ln]
--> 414 .prepare(gene, ln, start_lineage=sc, end_lineage=ec, **kwargs)
415 .fit()
416 )
~/Projects/cellrank/cellrank/utils/models/_models.py in fit(self, x, y, w, **kwargs)
803 pandas2ri.activate()
804
--> 805 df = pandas2ri.py2rpy(
806 pd.DataFrame(np.c_[self.x, self.y][use_ixs, :], columns=["x", "y"])
807 )
AttributeError: module 'rpy2.robjects.pandas2ri' has no attribute 'py2rpy'
Could this be a problem with my R or rpy2 version?
I'm using r-base
version 3.5.1 and rpy2
version 2.9.4.
Can we add a badge for the docs?
Capitalization of titles, labels, style (rcParams?)
I realised that our matrices can sometimes be ill-conditioned, see https://blogs.mathworks.com/cleve/2017/07/17/what-is-the-condition-number-of-a-matrix/ for an intro to condition numbers.
I our kernel class, I would add an option compupte_condn
in compute_transition_matix
, that uses np.linalg.cond
to compute the condition number and save it somewhere in the class. In logging level info
, I would print it out and if it's greater than 1e-15, I would issue a warning. There is currently no easy way to do this for sparse matrices, so we need to use T.A
for this. That's why I would make it optional and set compute_condn
to False
by default.
When I call cr.pl.cluster_fates(adata_t, mode='clustermap', cluster_key='clusters_fine', clusters=clusters)
I get
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-138-8017ed65d544> in <module>
1 clusters = ['Fev+ Beta', 'Fev+ Alpha', 'Fev+ Pyy','Fev+ Delta', 'Fev+ Epsilon']
----> 2 cr.pl.cluster_fates(adata_t, mode='clustermap', cluster_key='clusters_fine', clusters=clusters)
~/Projects/cellrank/cellrank/plotting/_cluster_fates.py in cluster_fates(adata, cluster_key, clusters, lineages, mode, final, basis, show_cbar, ncols, sharey, save, legend_kwargs, figsize, dpi, **kwargs)
501 fig = plot_violin_no_cluster_key() if cluster_key is None else plot_violin()
502 elif mode == "heatmap":
--> 503 fig = plot_heatmap()
504 else:
505 raise ValueError(
~/Projects/cellrank/cellrank/plotting/_cluster_fates.py in plot_heatmap()
374 ),
375 figsize=figsize,
--> 376 **kwargs,
377 )
378 g.ax_heatmap.set_xlabel(cluster_key)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/seaborn/matrix.py in clustermap(data, pivot_kws, method, metric, z_score, standard_scale, figsize, cbar_kws, row_cluster, col_cluster, row_linkage, col_linkage, row_colors, col_colors, mask, **kwargs)
1299 row_cluster=row_cluster, col_cluster=col_cluster,
1300 row_linkage=row_linkage, col_linkage=col_linkage,
-> 1301 **kwargs)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/seaborn/matrix.py in plot(self, metric, method, colorbar_kws, row_cluster, col_cluster, row_linkage, col_linkage, **kws)
1136 yind = np.arange(self.data2d.shape[0])
1137
-> 1138 self.plot_colors(xind, yind, **kws)
1139 self.plot_matrix(colorbar_kws, xind, yind, **kws)
1140 return self
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/seaborn/matrix.py in plot_colors(self, xind, yind, **kws)
1065
1066 heatmap(matrix, cmap=cmap, cbar=False, ax=self.ax_row_colors,
-> 1067 xticklabels=row_color_labels, yticklabels=False, **kws)
1068
1069 # Adjust rotation of labels
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/seaborn/matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, xticklabels, yticklabels, mask, ax, **kwargs)
526 if square:
527 ax.set_aspect("equal")
--> 528 plotter.plot(ax, cbar_ax, kwargs)
529 return ax
530
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/seaborn/matrix.py in plot(self, ax, cax, kws)
282 # Draw the heatmap
283 mesh = ax.pcolormesh(self.plot_data, vmin=self.vmin, vmax=self.vmax,
--> 284 cmap=self.cmap, **kws)
285
286 # Set the axis limits
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
1563 def inner(ax, *args, data=None, **kwargs):
1564 if data is None:
-> 1565 return func(ax, *map(sanitize_sequence, args), **kwargs)
1566
1567 bound = new_sig.bind(ax, *args, **kwargs)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/axes/_axes.py in pcolormesh(self, alpha, norm, cmap, vmin, vmax, shading, antialiased, *args, **kwargs)
6105 collection = mcoll.QuadMesh(Nx - 1, Ny - 1, coords,
6106 antialiased=antialiased, shading=shading,
-> 6107 **kwargs)
6108 collection.set_alpha(alpha)
6109 collection.set_array(C)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/collections.py in __init__(self, meshWidth, meshHeight, coordinates, antialiased, shading, **kwargs)
1940 def __init__(self, meshWidth, meshHeight, coordinates,
1941 antialiased=True, shading='flat', **kwargs):
-> 1942 Collection.__init__(self, **kwargs)
1943 self._meshWidth = meshWidth
1944 self._meshHeight = meshHeight
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/collections.py in __init__(self, edgecolors, facecolors, linewidths, linestyles, capstyle, joinstyle, antialiaseds, offsets, transOffset, norm, cmap, pickradius, hatch, urls, offset_position, zorder, **kwargs)
162
163 self._path_effects = None
--> 164 self.update(kwargs)
165 self._paths = None
166
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py in update(self, props)
1004
1005 with cbook._setattr_cm(self, eventson=False):
-> 1006 ret = [_update_property(self, k, v) for k, v in props.items()]
1007
1008 if len(ret):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py in <listcomp>(.0)
1004
1005 with cbook._setattr_cm(self, eventson=False):
-> 1006 ret = [_update_property(self, k, v) for k, v in props.items()]
1007
1008 if len(ret):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py in _update_property(self, k, v)
1000 if not callable(func):
1001 raise AttributeError('{!r} object has no property {!r}'
-> 1002 .format(type(self).__name__, k))
1003 return func(v)
1004
AttributeError: 'QuadMesh' object has no property 'dendrogram_ratio'
> /Users/marius/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py(1002)_update_property()
1000 if not callable(func):
1001 raise AttributeError('{!r} object has no property {!r}'
-> 1002 .format(type(self).__name__, k))
1003 return func(v)
1004
The same code with mode heatmap
worked though.
Start slowly preparing to do this.
would be handy to be able to do this, in case I want to work with several mc objects.
When I call cr.pl.lineages(adata, cluster_key='clusters')
, I get the traceback
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-24-941df5202743> in <module>
1 cr.tl.lineages(adata)
----> 2 cr.pl.lineages(adata, cluster_key='clusters')
~/Projects/cellrank/cellrank/plotting/_utils.py in lineages(adata, lineages, final, cluster_key, mode, time_key, color_map, **kwargs)
91 time_key=time_key,
92 color_map=color_map,
---> 93 **kwargs,
94 )
95
~/Projects/cellrank/cellrank/tools/_markov_chain.py in plot_lin_probs(self, cluster_key, mode, time_key, color_map, **kwargs)
899 if mode == "embedding":
900 scv.pl.scatter(
--> 901 self._adata, color=color, title=titles, color_map=color_map, **kwargs
902 )
903 elif mode == "time":
~/Projects/scvelo/scvelo/plotting/scatter.py in scatter(adata, x, y, basis, vkey, color, use_raw, layer, color_map, colorbar, palette, size, alpha, linewidth, linecolor, perc, groups, sort_order, components, projection, legend_loc, legend_fontsize, legend_fontweight, xlabel, ylabel, title, fontsize, figsize, xlim, ylim, show_density, show_assignments, show_linear_fit, show_polyfit, rug, add_outline, add_text, add_text_pos, outline_width, outline_color, n_convolve, smooth, rescale_color, dpi, frameon, zorder, ncols, wspace, hspace, show, save, ax, **kwargs)
100 for key in mkeys: # multi_kwargs[key] = key[i] if is multikey (list with more than 1 element) else key
101 multi_kwargs[key] = eval('{0}[i * (len({0}) > i)] if is_list({0}) else {0}'.format(key))
--> 102 ax.append(scatter(adata, ax=pl.subplot(gs), **multi_kwargs, **scatter_kwargs, **kwargs))
103
104 savefig_or_show(dpi=dpi, save=save, show=show)
~/Projects/scvelo/scvelo/plotting/scatter.py in scatter(adata, x, y, basis, vkey, color, use_raw, layer, color_map, colorbar, palette, size, alpha, linewidth, linecolor, perc, groups, sort_order, components, projection, legend_loc, legend_fontsize, legend_fontweight, xlabel, ylabel, title, fontsize, figsize, xlim, ylim, show_density, show_assignments, show_linear_fit, show_polyfit, rug, add_outline, add_text, add_text_pos, outline_width, outline_color, n_convolve, smooth, rescale_color, dpi, frameon, zorder, ncols, wspace, hspace, show, save, ax, **kwargs)
312 logg.warn('Invalid color key. Using grey instead.')
313
--> 314 smp = ax.scatter(x, y, c=c, alpha=alpha, marker='.', zorder=zorder, **kwargs)
315
316 if add_outline:
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
1597 def inner(ax, *args, data=None, **kwargs):
1598 if data is None:
-> 1599 return func(ax, *map(sanitize_sequence, args), **kwargs)
1600
1601 bound = new_sig.bind(ax, *args, **kwargs)
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, plotnonfinite, **kwargs)
4498 )
4499 collection.set_transform(mtransforms.IdentityTransform())
-> 4500 collection.update(kwargs)
4501
4502 if colors is None:
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py in update(self, props)
972
973 with cbook._setattr_cm(self, eventson=False):
--> 974 ret = [_update_property(self, k, v) for k, v in props.items()]
975
976 if len(ret):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py in <listcomp>(.0)
972
973 with cbook._setattr_cm(self, eventson=False):
--> 974 ret = [_update_property(self, k, v) for k, v in props.items()]
975
976 if len(ret):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/matplotlib/artist.py in _update_property(self, k, v)
968 if not callable(func):
969 raise AttributeError('{!r} object has no property {!r}'
--> 970 .format(type(self).__name__, k))
971 return func(v)
972
AttributeError: 'PathCollection' object has no property 'lineages'
I'm fairly certain that's because the absorption probabilities are saved as a Lineage
object in adata.obsm
. I think you had changed this, but maybe something went wrong when we copied the repo? To make sure no other bugs were introduced, can you please run the tutorial and check that it works?
When I use cr.pl.cluster_fates
with mode='violin'
, the lineage names don't have the 'to/from' prefix that they usually have (see mc.plot_lin_probs
). Would be nice to add this prefix, depending on the direction.
Especially for functionality (i.e. pipeline) + plotting.
When calling g_bwd.plot_main_states(n_cells=30)
I get:
(Note that n_cells=None
works fine)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-64-afed7edbd7ff> in <module>
----> 1 g_bwd.plot_main_states(n_cells=30)
~/Projects/cellrank/cellrank/tools/estimators/_gppca.py in plot_main_states(self, n_cells, lineages, cluster_key, mode, time_key, same_plot, show_dp, title, cmap, **kwargs)
429 same_plot=same_plot,
430 title=title,
--> 431 **kwargs,
432 )
433
~/Projects/cellrank/cellrank/tools/estimators/_gppca.py in _plot_states(self, attr, error_msg, n_cells, same_plot, title, **kwargs)
774 )
775 except Exception as e:
--> 776 raise e
777 finally:
778 cleanup()
~/Projects/cellrank/cellrank/tools/estimators/_gppca.py in _plot_states(self, attr, error_msg, n_cells, same_plot, title, **kwargs)
771 color=keys,
772 title=list(_main_states.cat.categories) if title is None else title,
--> 773 **kwargs,
774 )
775 except Exception as e:
~/Projects/scvelo/scvelo/plotting/scatter.py in scatter(adata, basis, x, y, vkey, color, use_raw, layer, color_map, colorbar, palette, size, alpha, linewidth, linecolor, perc, groups, sort_order, components, projection, legend_loc, legend_loc_lines, legend_fontsize, legend_fontweight, xlabel, ylabel, title, fontsize, figsize, xlim, ylim, add_density, add_assignments, add_linfit, add_polyfit, add_rug, add_text, add_text_pos, add_outline, outline_width, outline_color, n_convolve, smooth, rescale_color, color_gradients, dpi, frameon, zorder, ncols, nrows, wspace, hspace, show, save, ax, **kwargs)
291 c = interpret_colorkey(adata, basis, color, perc, use_raw) # phase portrait: color=basis, layer=color
292 else:
--> 293 c = interpret_colorkey(adata, color, layer, perc, use_raw) # embedding, gene trend etc.
294
295 if c is not None and not isinstance(c, str) and not isinstance(c[0], str):
~/Projects/scvelo/scvelo/plotting/utils.py in interpret_colorkey(adata, c, layer, perc, use_raw)
442 elif np.any([obs_key in c for obs_key in adata.obs.keys()]):
443 obs = adata.obs[[key for key in adata.obs.keys() if not isinstance(adata.obs[key][0], str)]]
--> 444 c = obs.astype(np.float32).eval(c)
445 elif not is_color_like(c):
446 raise ValueError('color key is invalid! pass valid observation annotation or a gene name')
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/generic.py in astype(self, dtype, copy, errors)
5696 else:
5697 # else, only a single dtype is given
-> 5698 new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors)
5699 return self._constructor(new_data).__finalize__(self)
5700
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/internals/managers.py in astype(self, dtype, copy, errors)
580
581 def astype(self, dtype, copy: bool = False, errors: str = "raise"):
--> 582 return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
583
584 def convert(self, **kwargs):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/internals/managers.py in apply(self, f, filter, **kwargs)
440 applied = b.apply(f, **kwargs)
441 else:
--> 442 applied = getattr(b, f)(**kwargs)
443 result_blocks = _extend_blocks(applied, result_blocks)
444
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/internals/blocks.py in astype(self, dtype, copy, errors)
605 if self.is_extension:
606 # TODO: Should we try/except this astype?
--> 607 values = self.values.astype(dtype)
608 else:
609 if issubclass(dtype.type, str):
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in astype(self, dtype, copy)
500 if is_integer_dtype(dtype) and self.isna().any():
501 raise ValueError("Cannot convert float NaN to integer")
--> 502 return np.array(self, dtype=dtype, copy=copy)
503
504 @cache_readonly
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/pandas/core/arrays/categorical.py in __array__(self, dtype)
1284 ret = take_1d(self.categories.values, self._codes)
1285 if dtype and not is_dtype_equal(dtype, self.categories.dtype):
-> 1286 return np.asarray(ret, dtype)
1287 if is_extension_array_dtype(ret):
1288 # When we're a Categorical[ExtensionArray], like Interval,
~/miniconda3/envs/cellrank/lib/python3.6/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87
ValueError: could not convert string to float: 'Beta'
> /Users/marius/miniconda3/envs/cellrank/lib/python3.6/site-packages/numpy/core/_asarray.py(85)asarray()
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87
To ensure harmonization between scvelo and cellrank, we should rename a couple of our attributes in cellrank. This is in particular relevant for scvelo's latent_time
and for the directed PAGA. I suggest the following changes:
adata
, write to final_states
and root_states
, rather than final_cells
and root_cells
final_states_cont
and root_states_cont
. This should be the aggregated probability that a cells belongs to a final/root state, no matter which one. (I guess we called this approx_rcs_probs
in MC, so we already have it there).The above two should be consistent between MC and GPCCA, we can talk about the details tonight. Also, I was thinking about how to make the naming more consistent between MC and GPCCA in general. I would suggest:
approx_rcs
to metastable_states
everywhere, i.e. compute_approx_rcs
goes to compute_metastable_states
in MC etc.metastable_states
to final_states
or root_states
in AnnDatametastable_states
to metastable_states_fwd
and metastable_states_bwd
in AnnData, and their restriction ('main states') to final_states
and root_states
in AnnDataWould be nice to be able to plot the PAGA with pie charts in the embedding, similar to how scvelo does it.
Use 'root cells'
and 'final cells'
instead.
I would generalize this a bit so that you can pass two series, and optionally, two color arrays associated with the categories. Potentially, you could use this: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.combine.html
I'm opening this up because I need such a utility function at the moment and rather than writing a second one, it would be nice to just have one general one. I would also rename the function and the variables inside it to not have any refernece to recurrent classes, to just keep it general.
A lot of functions in MarkovChain
are beginning with compute_
or plot_
. Also, we have a lot of attributes - maybe they should be read-only properties, since user shouldn't set them (and perhaps we can hide a few). Also, I'd give the functions more descriptive names (e.g. eigen
instead of just eig
)
For the former, I propose splitting the functions into modules, e.g.
mc_fwd = cr.tl.MarkovChain(...)
mc_fwd.compute.eigen()
...
mc_fwd.plot.eigen_embedding()
We could also have properties
module, e.g.
mc_fwd.props.G2M_score
@Marius1311 what do you think?
Do a proper typechecking using mypy
Additionally to the categorical root_states
and final_states
, we need continious root_states_cont
and final_states_cont
- basically, the probability of being a root/final states, irrespective of which one.
For CFLARE, this already exists, I guess it's currently called final_cells_probs
and root_cells_probs
, so here, it's only a matter of renaming. For GPCCA, we can get this simply by adding up our lineage probabilities corresponding to root/final states (not taking into account the rest
in case we did not redistribute, otherwise it won't work).
Once this is done, we can nicely interface with scvelo's latent time and directed paga. We're talking to Volker tomorrow, would be nice to have this ready by then so he can start with the interface on scvelo's side.
Currently, we use np.linalg.cond
to get the condition number, which works only on dense matrices.
Implement the approximation for sparse matrices as described in: https://github.com/sparse-condest/condest
Hi @Marius1311 ,
I saw a bug in the _counts() function of the tools/_cluster_fates module. I wanted to create a PR but I can't push to the repo so I think this is the best way to notify it.
The snippet of code is:
for name in cluster_names:
data = adata[adata.obs[cluster_key] == name].obsm[lin_key]
dim = data.shape[1]
index = np.random.randint(data.shape[0], size=n_samples)
l = [
np.random.choice(np.arange(dim), 1, p=np.atleast_1d(data[ind]))[0]
if ~np.isnan(data[ind]).any()
else 0
for ind in index
]
freq = [l.count(i) for i in np.arange(5)]
freq = [i if i > 0 else i + 1e-5 for i in freq]
d[name] = freq
The bug is freq = [l.count(i) for i in np.arange(5)]
should be freq = [l.count(i) for i in np.arange(dim)]
Best,
Juan.
We still need a figure for the index of our docs.
Write a custom comparator accounting for slight variations in sizes + do a proper requirements for tests.
The data we use in our tutorials, but also for our figures, will be a bit different from scvelo's data. For reproducibility, it would be nice to have a .datasets
module, that allows the user to conveniently download our datasets from the notebooks repo. Could you look into how scvelo does this and potentially adapt this to our needs?
Since we rely on the Lineage class, which cannot be reconstructed when loading the adata
object using sc.read
, I think we should have our own wrapper that will also optionally create dummy names, such as Lineage 1
if not present, as well as colors
Now that I know how to solve the issue, let's go back to the original implementation.
Hi guys! I'm trying to reproduce your CellRank pancreas tutorial, but when I try to plot the gene dynamics over latent time with
model = cr.ul.models.GamMGCVModel(adata, n_splines=7, sp=100)
cr.pl.gene_trends(adata, model=model, data_key='Ms', genes=['Gng12', 'Pak3'], time_key='latent_time', weight_threshold=0.05, weight_scale=1, filter_data=True, dpi=25)
it's stuck in the loading process (I stopped it after several hours). rpy2 is installed and working properly and switching the model with
from sklearn.gaussian_process.kernels import RBF
from sklearn.kernel_ridge import KernelRidge
model = cr.ul.models.SKLearnModel(adata, KernelRidge(kernel=RBF()))
gives me the same issue. Do you guys have any ideas?
For future reference: once new version of anndata
releases, we can use proper subsetting here of the Lineage
class thanks to scverse/anndata#335.
Add version, authors, email, etc. Check out versioneer and flit.
I often want to quickly visualise a small directed graph and I think your function would be nice for this. It would be great if we could add a way to just pass it a numpy array or sparse matrix, instead of having to save the matrix in an anndata object first. Do you agree, or do you think I should just use networkx straight away?
When I save a paga with pie charts with a .pdf
extension, then the legend keywords are ignored, i.e. they are just always plotted on data:
This is not the case when I save as e.g. .png
cellrank/cellrank/tools/_utils.py
Line 527 in 991b764
It's nowhere defined and raises a
NameError: name 'get_neighs_params' is not defined
Mode violoin does not work becasue of an indexing error. I call cr.pl.cluster_fates(adata, cluster_key='clusters', mode='violin')
and I get
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-22-6b9263a278de> in <module>
1 mc_fwd.compute_lin_probs(keys=['Ductal', 'Alpha, Beta, Epsilon'])
----> 2 cr.pl.cluster_fates(adata, cluster_key='clusters', mode='violin')
~/Projects/cellrank/cellrank/plotting/_cluster_fates.py in cluster_fates(adata, cluster_key, clusters, lineages, mode, final, show_cbar, ncols, sharey, save, legend_kwargs, figsize, dpi, **kwargs)
334 )
335 mask = np.array(mask, dtype=np.bool)
--> 336 data = np.array(adata.obsm[lk][mask, :][:, lin_names])
337 mean = np.nanmean(data, axis=0)
338 std = np.nanstd(data, axis=0) / np.sqrt(data.shape[0])
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
> /Users/marius/Projects/cellrank/cellrank/plotting/_cluster_fates.py(336)cluster_fates()
334 )
335 mask = np.array(mask, dtype=np.bool)
--> 336 data = np.array(adata.obsm[lk][mask, :][:, lin_names])
337 mean = np.nanmean(data, axis=0)
338 std = np.nanstd(data, axis=0) / np.sqrt(data.shape[0])
I think we should run a profiler over this to identify current bottlenecks - I have a feeling we can further speed this up.
In case the user already knows where his roots/endpoints are, it would be nice to have a convenient way to input this to the classe, as e.g. a dict with cell barcodes. @michalk8 , I'm not up to date whether this is already implemented or not?
For some reason, our references are not aligned, see https://cellrank.readthedocs.io/en/latest/references.html
It looks a bit odd at the moment.
Leander told me it may be more intutitive to have one function find_final
and another one find_root
instead of the combined root_final
- what do you think?
Getting both legends to somehow not overlap with the data is a real challenge. What's missing is the option to position the cellrank legend (final_states
) outside of the data, like the scvelo/scanpy legend (clusters
).
scvelo legend is set to right
, cellrank legend is left
. The problem is, it's not really outside the plotting area. best
does not give me anything better:
If at least I could turn the clusters
legend off, and if I were able to put the cellrank legend to the place there the scvelo legend currently is, that would already help me a lot.
Check for aperiodicity as well.
The assignment of colors to lineages isn't quite working yet. At the moment, colors are mixed around when I use keys
to compute lineages. See here to reproduce: https://github.com/theislab/cellrank_notebooks_dev/blob/melanoma/notebooks/analysis_pancreas/E15.5/ml_1_cellrank_testing.ipynb
I guess it's because I don't provide keys
in the right order?
I would be great to add mode='heatmap'
and mode='clustermap'
to cluster_fates
, to get e.g. output like this:
I wrote some dummy code that we can base this on: (I think cluster_fates
has most of this already)
def cluster_fates_heatmap(adata, cluster_key='louvain', clusters=None, title=None):
from seaborn import heatmap
if clusters is None:
clusters = adata.obs[cluster_key].cat.categories
lin_obj = adata.obsm['to_final_cells']
fev_heat = np.zeros((lin_obj.shape[1], len(clusters)))
for i, cl in enumerate(clusters):
mask = adata.obs[cluster_key] == cl
fev_heat[:, i] = np.mean(lin_obj[mask, :], axis=0)
fig, ax = plt.subplots()
heatmap(fev_heat, robust=True, xticklabels=clusters, yticklabels=lin_obj.names, annot=True, fmt='.2f', ax=ax)
if title is None:
figure_title = 'Averaged cluster fates'
else:
figure_title = title
ax.set_title(title)
plt.show()
In the latest version of anndata (and with the upcoming release), the neighbor graph will be stored in adata.obsp['distances']
and adata.obsp['connectivities']
, which we should already be accounting for now. Could, for instance, be done with a helper function that allows for both conventions
def get_neighs(adata, mode='distances'):
return adata.obsp[mode] if hasattr(adata, 'obsp') and mode in adata.obsp.keys() else adata.uns['neighbors'][mode]
It would be nice if .set_approx_rcs
would have an option add_to_existing: bool = True
, where
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.