
Comments (7)

ivirshup avatar ivirshup commented on July 18, 2024

Ah, that's right! But this only happens if there are duplicates in adata.{obs,var}.columns, not in adata.{obs,var}.index. I really dislike that pandas lets you do this.

Having an object like this will probably give you problems later (like with IO: #884).

I'm a little split between making this work, or throwing an informative error here.

from anndata.

ivirshup avatar ivirshup commented on July 18, 2024

Thanks for the report.

Do you have an example of code that can trigger this, or at least a traceback?

Because k is a key of the dataframe in question, I believe df[k] should always be a Series. Is there a case where this would fail?

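import pandas as pd

df: pd.DataFrame  # any DataFrame whose column labels are unique

# Iterating over a DataFrame yields its column labels
for k in df:
    assert isinstance(df[k], pd.Series)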


HelloWorldLTY avatar HelloWorldLTY commented on July 18, 2024

I think this error occurs when the var names or other column names are duplicated, because df[k] then returns a DataFrame rather than a Series. I am not sure whether duplicated names are actually disallowed, since this is reported as a warning rather than treated as a bug.
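
A minimal pandas-only sketch of the behavior described here, assuming a made-up frame in which the label "a" appears twice (the column names are purely illustrative, not taken from any real AnnData object):

```python
import pandas as pd

# Hypothetical frame: the label "a" is used for two columns
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "a", "b"])

dup = df["a"]     # a duplicated label selects both columns
single = df["b"]  # a unique label selects one column

print(type(dup).__name__, type(single).__name__)
# A DataFrame exposes `dtypes` (plural) but not `dtype`
print(hasattr(dup, "dtype"))
```

Any code that reads df[k].dtype for every column label k would therefore break on the duplicated label.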


HelloWorldLTY avatar HelloWorldLTY commented on July 18, 2024

Thanks. I will avoid such cases in the future. I would prefer an error, plus a call to var_names_make_unique() in the reading step.
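
One way such an informative error could look, sketched with plain pandas. The helper name check_unique_columns and the message wording are hypothetical and are not part of anndata:

```python
import pandas as pd


def check_unique_columns(df: pd.DataFrame, name: str = "obs") -> None:
    """Raise a ValueError naming any duplicated column labels (hypothetical helper)."""
    dupes = df.columns[df.columns.duplicated()].unique().tolist()
    if dupes:
        raise ValueError(f"Duplicate column names in {name}: {dupes}")


# Made-up frame with the label "a" repeated
df = pd.DataFrame([[1, 2, 3]], columns=["a", "a", "b"])

try:
    check_unique_columns(df, "obs")
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

Running a check like this before slicing would surface the duplicated label by name instead of the less obvious AttributeError.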


github-actions avatar github-actions commented on July 18, 2024

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!


sjfleming avatar sjfleming commented on July 18, 2024

I see this as well! I accidentally had a repeated column in adata.var, and when I try to slice adata as in adata[logical_array], I get

AttributeError: 'DataFrame' object has no attribute 'dtype'


emdann avatar emdann commented on July 18, 2024

This also happens with duplicated .obs columns, which is a common mistake when updating adata.obs with concatenation.

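import pandas as pd
import scanpy as sc

adata = sc.datasets.blobs()
# new_col keeps the name "blobs", so the concat below creates a duplicate column
new_col = adata.obs['blobs'].astype(int) + 1
adata.obs = pd.concat([adata.obs, new_col], axis=1)
adata[:, adata.var_names]  # slicing triggers the AttributeError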
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_189206/1060512404.py in ?()
      1 new_col = adata.obs['blobs'].astype(int) + 1
      2 adata.obs = pd.concat([adata.obs, new_col], axis=1)
      3 
----> 4 adata[:, adata.var_names]

/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, index)
   1083     def __getitem__(self, index: Index) -> AnnData:
   1084         """Returns a sliced view of the object."""
   1085         oidx, vidx = self._normalize_indices(index)
-> 1086         return AnnData(self, oidx=oidx, vidx=vidx, asview=True)

/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
    265     ):
    266         if asview:
    267             if not isinstance(X, AnnData):
    268                 raise ValueError("`X` has to be an AnnData object.")
--> 269             self._init_as_view(X, oidx, vidx)
    270         else:
    271             self._init_as_actual(
    272                 X=X,

/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, adata_ref, oidx, vidx)
    321         self._obsp = adata_ref.obsp._view(self, oidx)
    322         self._varp = adata_ref.varp._view(self, vidx)
    323         # fix categories
    324         uns = copy(adata_ref._uns)
--> 325         self._remove_unused_categories(adata_ref.obs, obs_sub, uns)
    326         self._remove_unused_categories(adata_ref.var, var_sub, uns)
    327         # set attributes
    328         self._obs = DataFrameView(obs_sub, view_args=(self, "obs"))

/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/anndata/_core/anndata.py in ?(self, df_full, df_sub, uns)
   1088     def _remove_unused_categories(
   1089         self, df_full: pd.DataFrame, df_sub: pd.DataFrame, uns: dict[str, Any]
   1090     ):
   1091         for k in df_full:
-> 1092             if not isinstance(df_full[k].dtype, pd.CategoricalDtype):
   1093                 continue
   1094             all_categories = df_full[k].cat.categories
   1095             with pd.option_context("mode.chained_assignment", None):

/oak/stanford/groups/pritch/users/emma/miniforge3/envs/perturb-vs-tissue-env/lib/python3.10/site-packages/pandas/core/generic.py in ?(self, name)
   6295             and name not in self._accessors
   6296             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   6297         ):
   6298             return self[name]
-> 6299         return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'dtype'
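
A hedged sketch of two ways to sidestep the duplicate when concatenating, using a plain DataFrame as a stand-in for adata.obs. The data and the name "blobs_plus_one" are made up for illustration:

```python
import pandas as pd

# Stand-in for adata.obs with a single "blobs" column (illustrative data)
obs = pd.DataFrame({"blobs": [0, 1, 2]})
new_col = obs["blobs"] + 1  # the Series is still named "blobs"

# Option 1: rename the Series before concatenating
obs_renamed = pd.concat([obs, new_col.rename("blobs_plus_one")], axis=1)

# Option 2: repair a frame that already has duplicates, keeping the first occurrence
obs_dup = pd.concat([obs, new_col], axis=1)
obs_fixed = obs_dup.loc[:, ~obs_dup.columns.duplicated()]

print(list(obs_renamed.columns))
print(list(obs_dup.columns))
print(list(obs_fixed.columns))
```

Option 2 silently discards the later column, so renaming up front is usually the safer choice when both columns carry information.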

