
xpysom-dask's Introduction

XPySom-dask

Self Organizing Maps with Dask Support

XPySom-dask is a Dask version of the original XPySom project. Because the original project implements a batched version of the SOM algorithm, it can be easily transformed into a distributed version using Dask.

Installation

You can install XPySom-dask from PyPI:

pip install xpysom-dask

By default, the dependencies for GPU execution are not installed. To install them automatically as well, specify a CUDA version. For example, for CUDA Toolkit 10.2 you would write:

pip install xpysom-dask[cuda102]

Alternatively, you can install XPySom-dask directly from the GitHub repository:

pip3 install git+https://github.com/jcfaracco/xpysom-dask.git

How to use it

The module interface is similar to MiniSom. Only the basics of the usage are covered here; for an overview of all the features, please refer to the original MiniSom examples at https://github.com/JustGlowing/minisom/tree/master/examples (the same examples are also in this repository, but they have not been updated yet).

To use XPySom, your data must be organized as a NumPy matrix where each row corresponds to an observation, or as a list of lists like the following:

chunks = (4, 2)
data = [[ 0.80,  0.55,  0.22,  0.03],
        [ 0.82,  0.50,  0.23,  0.03],
        [ 0.80,  0.54,  0.22,  0.03],
        [ 0.80,  0.53,  0.26,  0.03],
        [ 0.79,  0.56,  0.22,  0.03],
        [ 0.75,  0.60,  0.25,  0.03],
        [ 0.77,  0.59,  0.22,  0.03]]      
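To see what `chunks = (4, 2)` means for the 7x4 matrix above, the following sketch computes the block sizes along each axis in plain Python. The helper `block_sizes` is hypothetical and exists only for illustration; with Dask installed, `da.from_array(data, chunks=chunks).chunks` should report the same tuples.

```python
def block_sizes(shape, chunks):
    """Split each axis length into blocks of at most the given chunk size.

    Hypothetical helper for illustration; it is not part of XPySom-dask or Dask.
    """
    sizes = []
    for length, chunk in zip(shape, chunks):
        full, rest = divmod(length, chunk)
        # `full` complete blocks, plus one smaller trailing block if needed
        axis = [chunk] * full + ([rest] if rest else [])
        sizes.append(tuple(axis))
    return tuple(sizes)


shape = (7, 4)   # 7 observations, 4 features, as in the data above
chunks = (4, 2)
layout = block_sizes(shape, chunks)
print(layout)  # ((4, 3), (2, 2))
```

The rows are split into one block of 4 and one of 3 observations, and the columns into two blocks of 2 features, giving four blocks in total.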

Then you can train XPySom just as follows:

from xpysom_dask import XPySom

import dask.array as da

from dask.distributed import Client, LocalCluster

client = Client(LocalCluster())

dask_data = da.from_array(data, chunks=chunks)

som = XPySom(6, 6, 4, sigma=0.3, learning_rate=0.5, use_dask=True, chunks=chunks) # initialization of 6x6 SOM
som.train(dask_data, 100) # trains the SOM with 100 iterations

You can obtain the position of the winning neuron on the map for a given sample as follows:

som.winner(data[0])
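Conceptually, the winning neuron is the map cell whose weight vector is closest to the sample. The sketch below illustrates the idea on a tiny 2x2 map in plain Python, assuming squared Euclidean distance. The function `find_winner` and the weight values are made up for illustration; this is not XPySom's implementation.

```python
def find_winner(sample, weights):
    """Return (row, col) of the cell whose weight vector is nearest to sample.

    Hypothetical helper; assumes squared Euclidean distance.
    """
    best_pos, best_dist = None, float("inf")
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            dist = sum((s - v) ** 2 for s, v in zip(sample, w))
            if dist < best_dist:
                best_pos, best_dist = (i, j), dist
    return best_pos


# Made-up weights for a 2x2 map with 4 features per neuron
weights = [
    [[0.1, 0.9, 0.5, 0.5], [0.8, 0.5, 0.2, 0.0]],
    [[0.5, 0.5, 0.5, 0.5], [0.0, 0.0, 1.0, 1.0]],
]
sample = [0.80, 0.55, 0.22, 0.03]  # first observation from the data above
winner = find_winner(sample, weights)
print(winner)  # (0, 1)
```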

Differences with MiniSom

  • The batch SOM algorithm is used (instead of the online algorithm used in MiniSom). Therefore, use only train to train the SOM; train_random and train_batch are not available.
  • The decay_function input parameter is no longer a function but one of 'linear', 'exponential', or 'asymptotic'. As a consequence of this change, sigmaN and learning_rateN have been added as input parameters to represent the values at the last iteration.
  • New input parameter std_coeff, used to calculate the denominator of the Gaussian exponent, d = 2*std_coeff**2*sigma**2. The default value is 0.5 (as in Somoclu), which differs from MiniSom's original value of sqrt(pi).
  • New input parameter xp (default: the cupy module), the back-end module used for computations.
  • New input parameter n_parallel, which sets the size of the mini-batch (how many input samples are processed at a time).
  • Hexagonal grid support is experimental and significantly slower than the rectangular grid.
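To make the effect of std_coeff concrete, the sketch below evaluates the denominator d = 2*std_coeff**2*sigma**2 for the default value 0.5 and for MiniSom's sqrt(pi). The function `gaussian_neighborhood` is hypothetical, and the form exp(-dist_sq / d) is an assumption based on the usual Gaussian neighborhood rather than a copy of the library's code.

```python
import math


def gaussian_neighborhood(dist_sq, sigma, std_coeff=0.5):
    """Gaussian weight for a neuron at squared grid distance dist_sq from the winner.

    Hypothetical helper; assumes the neighborhood is exp(-dist_sq / d).
    """
    d = 2 * std_coeff**2 * sigma**2  # denominator from the formula above
    return math.exp(-dist_sq / d)


sigma = 0.3
d_default = 2 * 0.5**2 * sigma**2                  # std_coeff = 0.5
d_minisom = 2 * math.sqrt(math.pi)**2 * sigma**2   # std_coeff = sqrt(pi)

print(d_default, d_minisom)
# Weight of a direct neighbor (squared grid distance 1) under each setting
print(gaussian_neighborhood(1.0, sigma))
print(gaussian_neighborhood(1.0, sigma, std_coeff=math.sqrt(math.pi)))
```

Under these assumptions, the smaller default std_coeff gives a much narrower neighborhood for the same sigma, so results with identical sigma values are not directly comparable between XPySom and MiniSom.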

Authors

Copyright (C) 2021 Julio Faracco

xpysom-dask's People

Contributors

austint, feiyao-edinburgh, fgiobergia, jcfaracco, justglowing, manciukic, mpoegel, ph0ngp, sylfrena, tomage, tomcucinotta, vezeli, wei-zhang-thz

xpysom-dask's Issues

TypeError: __call__() got an unexpected keyword argument 'xp' when training

Hi, I encountered an issue even when running the example script you provided. I'm using the latest Dask with CPU support only.

`chunks = (4, 2)
data = [[ 0.80, 0.55, 0.22, 0.03],
        [ 0.82, 0.50, 0.23, 0.03],
        [ 0.80, 0.54, 0.22, 0.03],
        [ 0.80, 0.53, 0.26, 0.03],
        [ 0.79, 0.56, 0.22, 0.03],
        [ 0.75, 0.60, 0.25, 0.03],
        [ 0.77, 0.59, 0.22, 0.03]]
dask_data = da.from_array(data, chunks=chunks)

som = XPySom(6, 6, 4, sigma=0.3, learning_rate=0.5) # initialization of 6x6 SOM
som.train(dask_data, 100) # trains the SOM with 100 iterations`

And here is the error shown:

`---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[37], line 12
9 dask_data = da.from_array(data, chunks=chunks)
11 som = XPySom(6, 6, 4, sigma=0.3, learning_rate=0.5) # initialization of 6x6 SOM
---> 12 som.train(dask_data, 100) # trains the SOM with 100 iterations

File /glade/work/zxhua/conda-envs/metview_env/lib/python3.8/site-packages/xpysom_dask/xpysom.py:566, in XPySom.train(self, data, num_epochs, iter_beg, iter_end, verbose)
563 if end > len(data):
564 end = len(data)
--> 566 a, b = self._update(data_gpu[start:end], weights_gpu, eta, sig)
568 numerator_gpu += a
569 denominator_gpu += b

File /glade/work/zxhua/conda-envs/metview_env/lib/python3.8/site-packages/xpysom_dask/xpysom.py:432, in XPySom._update(self, x_gpu, weights_gpu, eta, sig)
421 """Updates the numerator and denominator accumulators.
422
423 Parameters
(...)
428 Iteration index
429 """
430 weights_gpu = self.xp.asarray(weights_gpu)
--> 432 wins = self._winner(x_gpu, weights_gpu)
434 g_gpu = self.neighborhood(wins, sig, xp=self.xp)*eta
436 sum_g_gpu = self.xp.sum(g_gpu, axis=0)

File /glade/work/zxhua/conda-envs/metview_env/lib/python3.8/site-packages/xpysom_dask/xpysom.py:415, in XPySom._winner(self, x_gpu, winners_gpu)
412 if len(x_gpu.shape) == 1:
413 x_gpu = self.xp.expand_dims(x_gpu, axis=0)
--> 415 self._activate(x_gpu, winners_gpu)
416 raveled_idxs = self._activation_map_gpu.argmin(axis=1)
417 return (self._unravel_precomputed[0][raveled_idxs], self._unravel_precomputed[1][raveled_idxs])

File /glade/work/zxhua/conda-envs/metview_env/lib/python3.8/site-packages/xpysom_dask/xpysom.py:343, in XPySom._activate(self, x_gpu, weights_gpu)
340 x_gpu = self.xp.expand_dims(x_gpu, axis=0)
342 if self._sq_weights_gpu is not None:
--> 343 self._activation_map_gpu = self._activation_distance(
344 x_gpu,
345 weights_gpu,
346 self._sq_weights_gpu,
347 xp=self.xp
348 )
349 else:
350 self._activation_map_gpu = self._activation_distance(
351 x_gpu,
352 weights_gpu,
353 xp=self.xp
354 )

TypeError: __call__() got an unexpected keyword argument 'xp'`

Any suggestions on how to fix this issue?
