ML-stats

A Python package with statistical tools for evaluating machine learning models.

Table of contents

- [ML-stats](#ml-stats)
- [Table of contents](#table-of-contents)
- [Installation](#installation)
- [Use the package](#use-the-package)
- [Current contents of the package](#current-contents-of-the-package)

Installation

For now, install the package by downloading the source files, adding the source directory to your Python path, and importing the functionality you need.

Use the package

To do a statistical test, you need to:

  1. Construct a matrix of the experimental results. The rows are the datasets/blocks, the columns are the methods/groups, and the values of the matrix are the recorded performance metric (i.e., what the experiment measures). For now, let's assume random results:
import numpy as np
import pandas as pd

matrix = pd.DataFrame(
    np.random.randn(2, 2), 
    columns=['method1', 'method2'], 
    index=['dataset1', 'dataset2']
)
  2. Create an instance of the BlockDesign class. The BlockDesign class stores the results and preprocesses them for later use. You can specify options such as threshold, precision, and higher_is_better.
from src.classifier_comparisons import BlockDesign

block_design = BlockDesign(matrix, threshold=0.01, precision=4, higher_is_better=True)
  3. Give this instance to the appropriate statistical test.
from src.multiple_classifiers import friedman_test

test_results = friedman_test(block_design, alpha=0.05)
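Experimental results are often logged as one record per run rather than as a ready-made matrix. The sketch below shows one way to reshape such records into the datasets-by-methods layout described in step 1, using only pandas. The record fields (`dataset`, `method`, `score`) and the scores themselves are made-up illustrative values, not part of this package.

```python
import pandas as pd

# Hypothetical long-form results: one record per (dataset, method) pair.
# The scores are made-up illustrative values, not real experimental results.
records = [
    {"dataset": "dataset1", "method": "method1", "score": 0.90},
    {"dataset": "dataset1", "method": "method2", "score": 0.80},
    {"dataset": "dataset2", "method": "method1", "score": 0.60},
    {"dataset": "dataset2", "method": "method2", "score": 0.75},
]

# Rows become datasets/blocks, columns become methods/groups,
# and the cell values are the recorded performance metric.
matrix = pd.DataFrame(records).pivot(index="dataset", columns="method", values="score")
print(matrix)
```

The resulting `matrix` has the same shape as the random one used above and could be passed to BlockDesign in the same way.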

Current contents of the package

Assuming that the results are stored in a matrix (let's create a random one for now):

import numpy as np
import pandas as pd

matrix = pd.DataFrame(
    np.random.randn(2, 2), 
    columns=['method1', 'method2'], 
    index=['dataset1', 'dataset2']
)

Comparing classifiers:

  1. Compute the average ranks (Friedman)

    from src.classifier_comparisons import BlockDesign
    
    average_ranks = BlockDesign(matrix).to_ranks()
  2. Compute the wins/ties/losses between different methods

    from src.classifier_comparisons import BlockDesign
    
    wins_ties_losses = BlockDesign(matrix).to_wins_ties_losses()
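To make these two quantities concrete, here is a minimal pandas-only sketch of how average ranks and pairwise wins/ties/losses are commonly computed. It is not the package's implementation: the scores are made-up, higher scores are assumed to be better, and treating `threshold` as the largest difference that still counts as a tie is an assumption.

```python
from itertools import combinations

import pandas as pd

# Made-up illustrative scores (rows: datasets, columns: methods).
scores = pd.DataFrame(
    [[0.90, 0.80, 0.70],
     [0.60, 0.75, 0.75],
     [0.85, 0.80, 0.60],
     [0.70, 0.65, 0.72]],
    columns=["method1", "method2", "method3"],
    index=["dataset1", "dataset2", "dataset3", "dataset4"],
)

# Rank the methods within each dataset (rank 1 = best score, ties share
# the average rank), then average the ranks over all datasets.
average_ranks = scores.rank(axis=1, ascending=False, method="average").mean(axis=0)

# Count wins/ties/losses for each pair of methods. Differences no larger
# than `threshold` (an assumed tie tolerance) count as ties.
threshold = 0.01
wins_ties_losses = {}
for a, b in combinations(scores.columns, 2):
    diff = scores[a] - scores[b]
    wins = int((diff > threshold).sum())
    losses = int((diff < -threshold).sum())
    ties = len(diff) - wins - losses
    wins_ties_losses[(a, b)] = (wins, ties, losses)

print(average_ranks)
print(wins_ties_losses)
```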

Non-parametric tests:

  1. Friedman test

    from src.classifier_comparisons import BlockDesign
    from src.multiple_classifiers import friedman_test
    
    block_design = BlockDesign(matrix)
    test_results = friedman_test(block_design, alpha=0.05)
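For a quick cross-check, the Friedman test is also available in SciPy as `scipy.stats.friedmanchisquare`. The sketch below uses that function, not this package, on made-up scores. SciPy needs at least three methods, and its statistic may differ from what `friedman_test` returns if the package applies a different correction.

```python
import pandas as pd
from scipy.stats import friedmanchisquare

# Made-up illustrative scores (rows: datasets, columns: methods).
scores = pd.DataFrame(
    [[0.90, 0.80, 0.70],
     [0.60, 0.75, 0.75],
     [0.85, 0.80, 0.60],
     [0.70, 0.65, 0.72]],
    columns=["method1", "method2", "method3"],
    index=["dataset1", "dataset2", "dataset3", "dataset4"],
)

# SciPy expects one sequence of measurements per method,
# aligned so that position i always refers to the same dataset.
samples = [scores[method].to_numpy() for method in scores.columns]
statistic, p_value = friedmanchisquare(*samples)

# Reject the null hypothesis (all methods perform equally) if p < alpha.
alpha = 0.05
reject_null = bool(p_value < alpha)
print(f"statistic = {statistic:.3f}, p = {p_value:.3f}, reject H0: {reject_null}")
```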

Non-parametric post-hoc tests:

  1. Nemenyi Friedman test

    from src.classifier_comparisons import BlockDesign
    from src.multiple_classifiers import nemenyi_friedman_test
    
    block_design = BlockDesign(matrix)
    p_values, sign_diffs = nemenyi_friedman_test(block_design, alpha=0.05)
  2. Bonferroni-Dunn test

    from src.classifier_comparisons import BlockDesign
    from src.multiple_classifiers import bonferroni_dunn_test
    
    block_design = BlockDesign(matrix)
    p_values, sign_diffs = bonferroni_dunn_test(block_design, alpha=0.05)
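As a rough illustration of what a Bonferroni-Dunn comparison involves, the sketch below compares every method against a single control using average ranks and a normal approximation. It is a textbook-style sketch under stated assumptions, not this package's implementation: the scores are made-up, the choice of `method1` as control is hypothetical, and the package's `bonferroni_dunn_test` may organize its output differently.

```python
from math import sqrt
from statistics import NormalDist

import pandas as pd

# Made-up illustrative scores (rows: datasets, columns: methods).
scores = pd.DataFrame(
    [[0.90, 0.80, 0.70],
     [0.60, 0.75, 0.75],
     [0.85, 0.80, 0.60],
     [0.70, 0.65, 0.72]],
    columns=["method1", "method2", "method3"],
    index=["dataset1", "dataset2", "dataset3", "dataset4"],
)
n_datasets, n_methods = scores.shape

# Average rank of each method (rank 1 = best, higher score assumed better).
average_ranks = scores.rank(axis=1, ascending=False).mean(axis=0)

# Standard error of the difference between two average ranks.
standard_error = sqrt(n_methods * (n_methods + 1) / (6 * n_datasets))

control = "method1"  # hypothetical choice of control method
alpha = 0.05
adjusted_p_values = {}
for method in scores.columns.drop(control):
    z = (average_ranks[method] - average_ranks[control]) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Bonferroni correction for the (k - 1) comparisons against the control.
    adjusted_p_values[method] = min(1.0, p_value * (n_methods - 1))

significant = {m: p < alpha for m, p in adjusted_p_values.items()}
print(adjusted_p_values)
print(significant)
```

The Nemenyi test follows the same rank-difference idea but compares all pairs of methods and uses the studentized range distribution instead of the normal distribution.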

Contributors

vincent-vercruyssen
