Comments (7)
Ah, that's right! But this only happens if there are duplicates in adata.{obs,var}.columns, not adata.{obs,var}.index. I really dislike that pandas lets you do this.
Having an object like this will probably give you problems later (like with IO: #884).
I'm a little split between making this work, or throwing an informative error here.
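To make the "informative error" option concrete, here is a minimal sketch of what such a check could look like. It is an illustration only: `check_unique_columns` is a hypothetical helper, not an existing anndata function, and the message wording is made up.

```python
import pandas as pd


def check_unique_columns(df: pd.DataFrame, name: str) -> None:
    """Raise a ValueError naming any duplicated column labels (hypothetical helper)."""
    # Index.duplicated() marks every repeat of a label after its first occurrence.
    dupes = df.columns[df.columns.duplicated()].unique().tolist()
    if dupes:
        raise ValueError(
            f"Column names of {name} must be unique, found duplicates: {dupes}"
        )


# A small frame with the label "a" used twice, standing in for adata.obs.
obs = pd.DataFrame([[0, 1, 2]], columns=["a", "b", "a"])

message = ""
try:
    check_unique_columns(obs, "obs")
except ValueError as err:
    message = str(err)
    print(message)
```

A check like this would fail early with the offending labels spelled out, instead of surfacing later as an AttributeError during slicing.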
from anndata.
Thanks for the report.
Do you have an example of code that can trigger this, or at least a traceback?
Because k is a key of the dataframe in question, I believe df[k] should always be a Series. Is there a case where this would fail?
import pandas as pd

# df stands for the obs or var dataframe in question
df: pd.DataFrame
for k in df:
    assert isinstance(df[k], pd.Series)
from anndata.
I think this error happens when the var names or other column names are duplicated, since df[k] then returns a DataFrame. I am not sure whether duplicated names are disallowed, since this currently produces a warning rather than being treated as a bug.
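The behaviour described here can be reproduced with pandas alone. The snippet below is a small sketch using a made-up frame (the labels "a" and "b" are arbitrary): it records what type df[k] has for each key, then tries the same .dtype access that fails in anndata.

```python
import pandas as pd

# Two columns share the label "a"; pandas allows this.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# For a duplicated label, df[k] selects every matching column,
# so the result is a DataFrame rather than a Series.
kinds = {k: type(df[k]).__name__ for k in df}
print(kinds)

# A DataFrame has .dtypes but no .dtype, hence the AttributeError.
error_message = ""
try:
    df["a"].dtype
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```

So the assumption that df[k] is always a Series holds only when the column labels are unique.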
from anndata.
Thanks. I will avoid such cases in the future. I would prefer an error, and I will add a call to var_names_make_unique() in the reading step.
from anndata.
This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!
from anndata.
I see this as well! I accidentally had a repeated column in adata.var, and when I try to slice adata as in adata[logical_array], I get
AttributeError: 'DataFrame' object has no attribute 'dtype'
from anndata.
This also happens with duplicated .obs columns, which is a common mistake when updating adata.obs with concatenation.
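import pandas as pd
import scanpy as sc

adata = sc.datasets.blobs()
# new_col keeps the name 'blobs', so the concat below creates a duplicate column
new_col = adata.obs['blobs'].astype(int) + 1
adata.obs = pd.concat([adata.obs, new_col], axis=1)
adata[:, adata.var_names]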
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_189206/1060512404.py in ?()
1 new_col = adata.obs['blobs'].astype(int) + 1
2 adata.obs = pd.concat([adata.obs, new_col], axis=1)
3
----> 4 adata[:, adata.var_names]
/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, index)
1083 def __getitem__(self, index: Index) -> AnnData:
1084 """Returns a sliced view of the object."""
1085 oidx, vidx = self._normalize_indices(index)
-> 1086 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
265 ):
266 if asview:
267 if not isinstance(X, AnnData):
268 raise ValueError("`X` has to be an AnnData object.")
--> 269 self._init_as_view(X, oidx, vidx)
270 else:
271 self._init_as_actual(
272 X=X,
/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, adata_ref, oidx, vidx)
321 self._obsp = adata_ref.obsp._view(self, oidx)
322 self._varp = adata_ref.varp._view(self, vidx)
323 # fix categories
324 uns = copy(adata_ref._uns)
--> 325 self._remove_unused_categories(adata_ref.obs, obs_sub, uns)
326 self._remove_unused_categories(adata_ref.var, var_sub, uns)
327 # set attributes
328 self._obs = DataFrameView(obs_sub, view_args=(self, "obs"))
/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, df_full, df_sub, uns)
1088 def _remove_unused_categories(
1089 self, df_full: pd.DataFrame, df_sub: pd.DataFrame, uns: dict[str, Any]
1090 ):
1091 for k in df_full:
-> 1092 if not isinstance(df_full[k].dtype, pd.CategoricalDtype):
1093 continue
1094 all_categories = df_full[k].cat.categories
1095 with pd.option_context("mode.chained_assignment", None):
/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/pandas/core/generic.py in ?(self, name)
6295 and name not in self._accessors
6296 and self._info_axis._can_hold_identifiers_and_holds_name(name)
6297 ):
6298 return self[name]
-> 6299 return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'dtype'
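One way to avoid the duplicate in the first place is to give the derived column its own name before it reaches the obs table. The sketch below uses a plain pandas frame as a stand-in for adata.obs (the data and the name `blobs_plus_one` are assumptions for illustration), and also shows how to detect the duplicate after the fact.

```python
import pandas as pd

# Stand-in for adata.obs with a "blobs" column (assumed data).
obs = pd.DataFrame(
    {"blobs": [0, 1, 2, 1]},
    index=["cell0", "cell1", "cell2", "cell3"],
)

# The derived Series keeps the name "blobs", so a plain concat would duplicate it.
new_col = obs["blobs"].astype(int) + 1

# Option 1: rename the Series before concatenating.
fixed = pd.concat([obs, new_col.rename("blobs_plus_one")], axis=1)

# Option 2: assign the column directly under a new name.
assigned = obs.assign(blobs_plus_one=new_col)

# Detecting the problem after the fact: list labels that occur more than once.
broken = pd.concat([obs, new_col], axis=1)
duplicated_labels = broken.columns[broken.columns.duplicated()].tolist()
print(duplicated_labels)
```

Checking `obs.columns.is_unique` before slicing is a cheap guard against this failure mode.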
from anndata.