
t-student-mixture-models's Introduction

t-Student-Mixture-Models

Implementation of the paper: 'Robust mixture modelling using the t distribution', D. Peel and G. J. McLachlan.

  • Only Python >= 3.7 supported.
  • Code coverage: 76%.

Install with pip

$ python3 -m pip install smm --user

Install from source

$ git clone https://github.com/luiscarlosgph/t-Student-Mixture-Models.git
$ cd t-Student-Mixture-Models
$ python3 setup.py install --user

Usage

See example in src/example.py.

$ python3 src/example.py

Unit tests

To run the tests execute:

$ python3 setup.py test

Coverage

To run the coverage test:

$ python3 -m pip install coverage
$ python3 -m coverage run setup.py test
$ python3 -m coverage html

Then open 'htmlcov/index.html' and search for the line containing 'smm/smm.py'.

Documentation

See t-Student-Mixture-Models documentation.

Author

Luis Carlos Garcia-Peraza Herrera ([email protected]).

License

BSD 3-Clause License, see LICENSE file for more information.


t-student-mixture-models's Issues

ValueError: operands could not be broadcast together for "tied" covariance

Hi Luis,

I get the following error when trying to run SMM with "tied" covariance: "ValueError: operands could not be broadcast together".

In your implementation, avg_X2 has shape (n_components, n_features) while avg_means2 has shape (n_features, n_features), so the two cannot be subtracted.

Could you take a look at this?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in
----> 7 SMM(n_classes, covariance_type='tied').fit(X).predict(X)

/opt/conda/lib/python3.8/site-packages/smm/smm.py in fit(self, X, y)
    368 # Maximisation step
    369 try:
--> 370 self.maximisation_step(X, responsibilities,
    371 gammaweights
    372 )

/opt/conda/lib/python3.8/site-packages/smm/smm.py in maximisation_step(self, X, responsibilities, gammaweights)
    236 self.covariance_type
    237 ]
--> 238 self.covars_ = covar_mstep_func(
    239 X, zu, z_sum, self.means_, self.min_covar
    240 )

/opt/conda/lib/python3.8/site-packages/smm/smm.py in _covar_mstep_tied(X, zu, z_sum, means, min_covar)
    846 avg_X2 = np.dot(zu.T, X * X)
    847 avg_means2 = np.dot(z_sum * means.T, means)
--> 848 out = avg_X2 - avg_means2
    849 out /= z_sum.sum()
    850 out.flat[::len(out) + 1] += min_covar

ValueError: operands could not be broadcast together with shapes (3,4) (4,4)
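
The shape mismatch reported above can be reproduced with plain NumPy, independently of the library. The sketch below uses random stand-in arrays with the same shapes as in the traceback (3 components, 4 features); the values of `zu` and `z_sum` are placeholders, not what SMM actually computes. The function `tied_scatter` is a hypothetical name and only one plausible way to obtain an (n_features, n_features) matrix by pooling the weighted scatter over all components. It is not the author's fix, and it keeps the `z_sum.sum()` normalisation only because the original code uses it.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_components, n_features = 150, 3, 4

# Random stand-ins with the shapes seen in the traceback (not real SMM values).
X = rng.normal(size=(n_samples, n_features))
zu = rng.random(size=(n_samples, n_components))   # per-sample, per-component weights
z_sum = zu.sum(axis=0)                            # assumption: one total per component
means = rng.normal(size=(n_components, n_features))

# The two expressions from _covar_mstep_tied as quoted in the traceback.
avg_X2 = np.dot(zu.T, X * X)                  # shape (n_components, n_features)
avg_means2 = np.dot(z_sum * means.T, means)   # shape (n_features, n_features)
print(avg_X2.shape, avg_means2.shape)

try:
    avg_X2 - avg_means2
except ValueError as err:
    print("ValueError:", err)


def tied_scatter(X, zu, z_sum, means, min_covar=1e-7):
    """Hypothetical sketch: pooled weighted scatter shared by all components."""
    n_features = X.shape[1]
    out = np.zeros((n_features, n_features))
    for k in range(means.shape[0]):
        diff = X - means[k]                        # (n_samples, n_features)
        out += np.dot(zu[:, k] * diff.T, diff)     # weighted outer products, summed
    out /= z_sum.sum()                             # same normalisation as the original code
    out.flat[::n_features + 1] += min_covar        # regularise the diagonal
    return out


tied_cov = tied_scatter(X, zu, z_sum, means)
print(tied_cov.shape)
```

The result is a single symmetric matrix, which is what a "tied" covariance is expected to be. Whether this weighting matches the M-step in the paper would still need to be checked against the library's definitions of `zu` and `z_sum`.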

Problems with restricted covariances

I see recurring problems with every covariance type other than "full", that is, "diag", "spherical", etc.
Code to reproduce:

import numpy as np
from sklearn import datasets
from smm import SMM

data = datasets.load_iris()
X, y = data.data.astype('float32'), data.target.astype('int32')
# X = (X - X.mean(0)) / X.std(0)
n_classes = len(np.unique(y))
SMM(n_classes, covariance_type='diag').fit(X).predict(X)

Result: RuntimeWarning: Failed to converge after 1000 iterations, value is nan

Sort mixture weights of both true and estimated models in descending order

I ran the example and got the results below. I suggest sorting the mixture components of both the true and the estimated model by weight, from highest to lowest, so that the estimated and true values are easier to compare. The estimated parameters are clearly close to the true ones, but the order of the mixture components is reversed.

==================================================
t-Student mixture optimized vs real parameters:

   Number of samples: 10000
   Component[s] in the mixture: 2

   Estimated mean[s]:
[[ 8.84202298 74.84667939  4.80967904]
 [36.87701105 11.82265983 72.64588165]]
   Correct mean[s]:
[[37 12 72]
 [ 9 75  5]]

   Estimated covariance[s]:
[[[111.88362606  26.33895845  45.52229972]
  [ 26.33895845 143.76095318  18.76622291]
  [ 45.52229972  18.76622291 259.41597647]]

 [[599.40578914  68.2534774   25.87431092]
  [ 68.2534774  464.63703569  70.41261736]
  [ 25.87431092  70.41261736 570.23933138]]]
   Correct covariance[s]:
[[[610.41534149  65.          17.        ]
  [ 65.         454.41534149  77.        ]
  [ 17.          77.         594.41534149]]

 [[114.48171983  26.          51.        ]
  [ 26.         142.48171983  19.        ]
  [ 51.          19.         270.48171983]]]

   Estimated degree[s] of freedom: [4.973859   2.83650923]
   Correct degree[s] of freedom: [3. 5.]

   Estimated weight[s]: [0.57232826 0.42754234]
   Correct weight[s]: [0.42756875 0.57243125]

==================================================
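
The suggested reordering could be sketched as follows. This is a minimal illustration, not code from the repository: `sort_by_weight` is a hypothetical helper, and the arrays are copied from the output above (means and degrees of freedom only, for brevity). The same index array would be applied to the covariances.

```python
import numpy as np


def sort_by_weight(weights, *params):
    """Reorder the weights and any per-component arrays by descending weight."""
    order = np.argsort(weights)[::-1]
    return (weights[order],) + tuple(p[order] for p in params)


# Values copied from the example output above.
est_weights = np.array([0.57232826, 0.42754234])
est_means = np.array([[8.84202298, 74.84667939, 4.80967904],
                      [36.87701105, 11.82265983, 72.64588165]])
est_dof = np.array([4.973859, 2.83650923])

true_weights = np.array([0.42756875, 0.57243125])
true_means = np.array([[37, 12, 72],
                       [9, 75, 5]])
true_dof = np.array([3.0, 5.0])

est_w, est_m, est_d = sort_by_weight(est_weights, est_means, est_dof)
true_w, true_m, true_d = sort_by_weight(true_weights, true_means, true_dof)

# After sorting, row i of the estimate lines up with row i of the truth.
print(est_m)
print(true_m)
```

Sorting by weight works here because the two weights are well separated. With nearly equal weights the order could still differ between runs, and matching components by their means would be more reliable.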

Robust + Examples

Hi!

I was wondering how robust and mature this project is.
I need a Python implementation that fits a mixture of Student's t distributions to samples and gives me the "location", "scale" and "degrees of freedom" parameters.

I just tried running the code on a simple one-component model but got this error:

RuntimeError: EM algorithm was never able to compute a valid likelihood given initial parameters. Try different init parameters (or increasing n_init) or check for degenerate data.

Furthermore, I would be grateful if you could include a more comprehensive example (the one provided as a comment in the code seems to be broken too).

For example (please correct me if I am wrong):

import numpy as np
from sklearn import mixture

np.random.seed(1)
g = mixture.SMM(n_components=2)

This example fails immediately with an error, because SMM is not part of sklearn.

Thanks.
