Comments (7)

lmcinnes commented on April 28, 2024

Hi, thanks for trying out the software. I thought I had caught most of the cases where this could occur, but apparently not. What has happened, I believe, is that the algorithm failed to find 15 nearest neighbors for at least one point; this breaks a lot of things. This may be data related, or possibly parameter related. I'll grab the data and see if I can work out what is going wrong in this case. Thanks for the detailed report and the reproducer data; it will make this process go much faster.

from umap.

ensonario commented on April 28, 2024

Thanks for the response.

If I try to increase the number of nearest neighbors:

embedding = umap.UMAP(n_neighbors=25, min_dist=0.1,
                      metric='correlation').fit_transform(df.values)

I'm getting a new error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-8b76b22db9a0> in <module>()
      1 embedding = umap.UMAP(n_neighbors=25, min_dist=0.1,
----> 2                       metric='correlation').fit_transform(df.values)

~/venv3/lib/python3.6/site-packages/umap_learn-0.1.3-py3.6.egg/umap/umap_.py in fit_transform(self, X, y)
    790             Embedding of the training data in low-dimensional space.
    791         """
--> 792         self.fit(X)
    793         return self.embedding_

~/venv3/lib/python3.6/site-packages/umap_learn-0.1.3-py3.6.egg/umap/umap_.py in fit(self, X, y)
    757 
    758         graph = fuzzy_simplicial_set(X, self.n_neighbors,
--> 759                                      self._metric, self.metric_kwds)
    760 
    761         if self.n_edge_samples is None:

~/venv3/lib/python3.6/site-packages/scipy/sparse/base.py in multiply(self, other)
    297         """Point-wise multiplication by another matrix
    298         """
--> 299         return self.tocsr().multiply(other)
    300 
    301     def maximum(self, other):

~/venv3/lib/python3.6/site-packages/scipy/sparse/compressed.py in multiply(self, other)
    388                 return copy._mul_sparse_matrix(other)
    389             else:
--> 390                 raise ValueError("inconsistent shapes")
    391 
    392         # Assume other is a dense matrix/array, which produces a single-item

ValueError: inconsistent shapes

lmcinnes commented on April 28, 2024

I believe that is likely the same underlying error presenting in a different way when it arrives further downstream in the code. I'll have to look at the data and see what I've missed (as well as hopefully adding a few checks to provide more meaningful error statements when things like this do go wrong).
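
A check of the kind mentioned above might look like the following minimal sketch. It is not umap's actual code: the function name `check_knn_indices`, the `(n_samples, n_neighbors)` array layout, and the use of `-1` to mark a slot the search failed to fill are all assumptions made for illustration.

```python
import numpy as np


def check_knn_indices(knn_indices, n_neighbors):
    """Raise a descriptive error if any point has fewer than n_neighbors neighbors.

    Hypothetical helper: assumes -1 marks a neighbor slot that was never filled.
    """
    knn_indices = np.asarray(knn_indices)
    if knn_indices.ndim != 2 or knn_indices.shape[1] != n_neighbors:
        raise ValueError(
            f"expected a neighbor array of shape (n_samples, {n_neighbors}), "
            f"got {knn_indices.shape}"
        )
    # Count the valid (non-negative) neighbor indices in each row.
    found = (knn_indices >= 0).sum(axis=1)
    bad = np.flatnonzero(found < n_neighbors)
    if bad.size:
        raise ValueError(
            f"{bad.size} point(s) have fewer than {n_neighbors} neighbors "
            f"(first offending row: {bad[0]}, found {found[bad[0]]})"
        )
```

Running a check like this before the graph is built would surface the problem as one clear message rather than as an "inconsistent shapes" error further downstream.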

lmcinnes commented on April 28, 2024

It looks like the RP tree initialisation for NN-descent is carving off some outliers as singletons; this, in turn, makes NN-descent not work as well as it should. The end result is ... less than ideal. I should be able to fix this by randomly initialising any bad points. Hopefully I can get that done later today.
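
The idea of randomly initialising bad points can be sketched roughly as follows. This is only an illustration and not the fix that went into umap: the helper name `fill_missing_neighbors` and the convention that `-1` marks an unfilled neighbor slot are assumptions, and the sketch assumes there are more points than neighbors per point.

```python
import numpy as np


def fill_missing_neighbors(knn_indices, rng=None):
    """Replace unfilled neighbor slots (-1) with random distinct point indices.

    Hypothetical helper: returns a copy and leaves the input array untouched.
    """
    rng = np.random.default_rng(rng)
    knn_indices = np.array(knn_indices, copy=True)
    n_samples = knn_indices.shape[0]
    # Visit only the rows that still contain at least one unfilled slot.
    for i in np.flatnonzero((knn_indices < 0).any(axis=1)):
        row = knn_indices[i]  # view into the copy, so writes update it
        missing = np.flatnonzero(row < 0)
        # Candidates: every point except i itself and neighbors already present.
        taken = set(row[row >= 0].tolist()) | {int(i)}
        candidates = np.array([j for j in range(n_samples) if j not in taken])
        row[missing] = rng.choice(candidates, size=missing.size, replace=False)
    return knn_indices
```

The random entries are unlikely to be good neighbors, but they give every point a full starting list that a later refinement step such as NN-descent can improve.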

lmcinnes commented on April 28, 2024

This fixed the issue on your data for me locally. If you could rebuild and re-install from master and let me know whether it resolves the issue for you as well, that would be greatly appreciated.

ensonario commented on April 28, 2024

Thanks a lot! Let me test it ;)

ensonario commented on April 28, 2024

It seems the problem is solved! Thanks a lot for the quick response and the fix! And of course, thanks for this amazing software.
