jumpmethod's Introduction

An Implementation of the Jump Clustering Algorithm in Python

Motivation

This repository attempts to replicate some of the results reported in

Sugar, C. A., & James, G. M. (2003). Finding the number of clusters in a dataset: An information-theoretic approach. Journal of the American Statistical Association, 98(463), 750–763.

In particular, the goal is to replicate the results illustrated in Figure 4 of that paper.

A second reason for creating this repository is that a Python implementation of the Jump Method is hard to find online (I have not found any).

Files

The article folder contains the article itself, including the illustration of the results to be replicated.

The repository also contains jumpmethod.py, a Python class with two functions, Distortions and Jumps, which compute the vectors of distortions and jumps over the range of cluster counts to check.
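As a rough illustration of what these two quantities are, here is a minimal pure-Python sketch. It is not the code in jumpmethod.py: the function names, signatures, and the small k-means helper are hypothetical, and it assumes squared Euclidean distance and the transformation power Y = p/2 (p being the number of dimensions) suggested in the paper. For each K, the distortion is the per-dimension average squared distance from each point to its nearest cluster centre; the jump at K is the difference between the transformed distortions at K and K - 1, with the transformed distortion at K = 0 taken to be 0.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _kmeans_distortion(points, k, rng, n_iter=50, n_init=5):
    """Lowest per-dimension average squared distance to the nearest centre
    found over a few random restarts of plain Lloyd's k-means."""
    p = len(points[0])
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)
        for _ in range(n_iter):
            # Assignment step: group points by their nearest centre.
            groups = [[] for _ in range(k)]
            for pt in points:
                j = min(range(k), key=lambda c: _sq_dist(pt, centers[c]))
                groups[j].append(pt)
            # Update step: move each centre to the mean of its group
            # (an empty group keeps its previous centre).
            new_centers = [
                tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
                for j, g in enumerate(groups)
            ]
            if new_centers == centers:
                break
            centers = new_centers
        total = sum(min(_sq_dist(pt, c) for c in centers) for pt in points)
        best = min(best, total / (len(points) * p))
    return best


def distortions(points, max_k, seed=0):
    """Distortion d_K for K = 1..max_k (hypothetical helper)."""
    rng = random.Random(seed)
    return [_kmeans_distortion(points, k, rng) for k in range(1, max_k + 1)]


def jumps(dist, power):
    """Jumps J_K = d_K^(-Y) - d_(K-1)^(-Y), with d_0^(-Y) defined as 0.
    Assumes every distortion is strictly positive."""
    transformed = [0.0] + [d ** (-power) for d in dist]
    return [b - a for a, b in zip(transformed, transformed[1:])]


# Toy data (made up for illustration): three separated 2-D Gaussian blobs.
rng = random.Random(1)
blob_centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
data = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in blob_centres for _ in range(30)]

d = distortions(data, max_k=6)
j = jumps(d, power=len(data[0]) / 2)   # Y = p / 2
best_k = j.index(max(j)) + 1           # K with the largest jump
print(best_k)
```

The estimated number of clusters is the K with the largest jump. Because this sketch uses a simple k-means with random restarts, its result depends on the seed and on how well the restarts avoid local optima.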

Finally, the Jupyter notebook Simulations (Figure 4).ipynb illustrates why and how the replication can be achieved.

TODO:

  1. Add the transformed distortion curves to the class (easy: they are just the cumulative sum of Jumps).
  2. Add a description of the algorithm to the README.
  3. Replicate the Iris results (SJ, p. 12).
  4. Replicate the bootstrap results (SJ, pp. 13–15).
  5. Check performance (improvements may be possible).
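The first item can be sketched in a few lines. This is only an illustration: the function name and the input values are hypothetical, not part of jumpmethod.py. It relies on the convention that the transformed distortion at K = 0 is 0, so the running sum of the jumps up to K gives back the transformed distortion at K.

```python
from itertools import accumulate


def transformed_distortions(jumps):
    """Recover the transformed distortion curve for K = 1..n as the
    running sum of the jumps (assumes the curve starts from 0 at K = 0)."""
    return list(accumulate(jumps))


example_jumps = [0.25, 0.75, 3.0]  # made-up values for illustration
curve = transformed_distortions(example_jumps)
print(curve)  # [0.25, 1.0, 4.0]
```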
