Comments (8)

falexwolf commented on July 18, 2024

Double slicing in backed mode (taking a view of a view in backed mode) is not allowed and we now throw an error: 2b622f4

Why don't you do adata_backed[0:2, 0:2]?

from anndata.

bkmartinjr commented on July 18, 2024

We are using boolean slicing to allow for complex filtering, and currently double slicing with non-integer or non-slice selectors is not allowed.

In other words, this throws an error:

obs_selector = np.array([True, False, ...])
vars_selector = np.array([False, True, ...])
adata_backed[obs_selector, vars_selector]

And the error message suggests that we try double slicing.


falexwolf commented on July 18, 2024

OK! Right, double slicing in memory mode works fine and is currently the only (not nice) way to get submatrices from boolean vectors. In backed mode, it's quite a bit trickier.

In any case: if you need this, I'll implement the functionality, maybe even tonight. Then double slicing won't be necessary anymore.
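For readers following along, here is a minimal sketch of what double slicing with boolean vectors looks like. It uses a plain NumPy array as a stand-in for an in-memory data matrix (an assumption for illustration, not anndata's own code), and the names X, obs_selector and var_selector are made up for the example.

```python
import numpy as np

# Stand-in for an in-memory data matrix: 3 observations x 4 variables.
X = np.arange(12).reshape(3, 4)

obs_selector = np.array([True, False, True])         # keep rows 0 and 2
var_selector = np.array([False, True, True, False])  # keep columns 1 and 2

# Passing both masks in one indexing call pairs them element-wise
# (row 0 with column 1, row 2 with column 2) instead of taking the
# submatrix.
paired = X[obs_selector, var_selector]

# Double slicing: select rows first, then select columns from the result.
sub = X[obs_selector][:, var_selector]

print(paired.shape)  # 1-D array of paired elements
print(sub.shape)     # 2-D submatrix
```

The first step materializes an intermediate copy of the selected rows, which is cheap in memory but is the part that gets awkward when the matrix lives on disk.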


bkmartinjr commented on July 18, 2024

At the moment, we can work around it and don't see a need for you to implement this urgently. I ran into the bug because I was benchmarking to determine optimal ways to use anndata in cellxgene. I think the best path would be for us to ship our "MVP", and then have a chat with you about performance. Backed mode will either be useful or not, based upon that. Seem reasonable?


falexwolf commented on July 18, 2024

Sounds very reasonable! Let's discuss!

Meanwhile, I think submatrix extraction via slicing should be relatively straightforward to get by applying np.ix_() to the data matrix, with everything else staying as is. As we discussed this ages ago and you worked quite a bit on the indexing at the time, @flying-sheep, any bandwidth for doing this? It's essentially only making sure that the index normalization produces non-slices and handles pd.Index objects appropriately.
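A rough sketch of the np.ix_() idea, again on a plain NumPy array rather than an actual AnnData object. The helpers to_positions and submatrix are hypothetical names invented for this illustration, and pd.Index handling is left out; the point is only that each selector is normalized to an integer position array (no slices left) before np.ix_() builds the open mesh.

```python
import numpy as np

def to_positions(selector, length):
    """Hypothetical helper: normalize a slice, boolean mask, or integer
    sequence into an array of integer positions (never a slice)."""
    if isinstance(selector, slice):
        return np.arange(*selector.indices(length))
    selector = np.asarray(selector)
    if selector.dtype == bool:
        return np.flatnonzero(selector)
    return selector

def submatrix(X, obs_selector, var_selector):
    """Hypothetical helper: extract the cross product of both selectors
    in a single indexing step via np.ix_()."""
    rows = to_positions(obs_selector, X.shape[0])
    cols = to_positions(var_selector, X.shape[1])
    return X[np.ix_(rows, cols)]

X = np.arange(12).reshape(3, 4)
obs_mask = np.array([True, False, True])

# Boolean mask on rows combined with a slice on columns.
sub = submatrix(X, obs_mask, slice(1, 3))
print(sub)
```

With both selectors normalized up front, the extraction is a single indexing call and no intermediate row-sliced copy is needed.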


ivirshup commented on July 18, 2024

Progress was definitely made here, but I'm not sure this issue is totally solved. Double "fancy" indexing over multiple axes isn't supported by h5py datasets. This does work with backed anndata sparse matrices (at least on master).

Side note: It might be possible for zarr dense arrays via get_orthogonal_selection.


github-actions commented on July 18, 2024

This issue has been automatically marked as stale because it has not had recent activity.
Please add a comment if you want to keep the issue open. Thank you for your contributions!


flying-sheep commented on July 18, 2024

We throw a meaningful error here and if we ever start supporting it, we’ll announce it.

def test_double_index(adata, backing_h5ad):
    adata.filename = backing_h5ad
    with pytest.raises(ValueError):
        # no view of view of backed object currently
        adata[:2][:, 0]
