dirichletcal / dirichlet_python

License: MIT License

Python 98.41% Makefile 1.59%

dirichlet_python's Introduction

Dirichlet Calibration Python implementation

This is a Python implementation of the Dirichlet Calibration presented in Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration at NeurIPS 2019.

Installation

# Clone the repository
git clone git@github.com:dirichletcal/dirichlet_python.git
# Go into the folder
cd dirichlet_python
# Create a new virtual environment with Python3
python3.8 -m venv venv
# Load the generated virtual environment
source venv/bin/activate
# Upgrade pip
pip install --upgrade pip
# Install all the dependencies
pip install -r requirements.txt
pip install --upgrade jaxlib

Unittest

python -m unittest discover dirichletcal

Cite

If you use this code in a publication please cite the following paper

@inproceedings{kull2019dircal,
  title={Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration},
  author={Kull, Meelis and Nieto, Miquel Perello and K{\"a}ngsepp, Markus and Silva Filho, Telmo and Song, Hao and Flach, Peter},
  booktitle={Advances in Neural Information Processing Systems},
  pages={12295--12305},
  year={2019}
}

Examples

You can find some examples of how to use this package in the examples folder.

PyPI

To push a new version to PyPI, first build the package

python3.8 setup.py sdist

And then upload it to PyPI with twine

twine upload dist/*

twine may ask for a user name and password if these are not set in a .pypirc file in your home directory

[pypi]
username = __token__
password = pypi-yourtoken

dirichlet_python's Issues

Full DirichletCalibrator throwing error

The full DirichletCalibrator throws the error below:

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/dirichletcal/__init__.py in fit(self, x, y, x_val, y_val, **kwargs)
     45             self.calibrator_ = FullDirichletCalibrator(
     46                 reg_lambda_list=self.l2_grid, reg_mu_list=self.comp_l2,
---> 47                 initializer=self.initializer)
     48         elif self.matrix_type == 'full_gen':
     49             self.calibrator_ = GenerativeDirichletCalibrator()

TypeError: __init__() got an unexpected keyword argument 'reg_lambda_list'

Still getting error in using all dirichlet methods in mapped calibrators

lib/python3.6/site-packages/dirichletcal/__init__.py in fit(self, x, y, x_val, y_val, **kwargs)
     34 
     35     def fit(self, x, y, x_val=None, y_val=None, **kwargs):
---> 36         __self.setup()
     37 
     38         if self.matrix_type == 'diagonal':

NameError: name '_DirichletCalibrator__self' is not defined

Add generative fitting

We tested the generative fitting for full Dirichlet in the past in commit 739b64b

Should we consider adding it again?

AttributeError: 'FullDirichletCalibrator' object has no attribute 'weights_init'

When I run python example/calibration_example.py, the output is:

Training a classifier with cross-validation
Crossval scores: [-0.13321234 -0.05923569 -0.16739947]
Average neg log loss -0.120
Traceback (most recent call last):
  File "calibration_example.py", line 37, in <module>
    gscv.fit(cla_scores_train, y_train)
  File "/home/csong-wps/github/dirichlet_python/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/csong-wps/github/dirichlet_python/venv/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 765, in fit
    base_estimator = clone(self.estimator)
  File "/home/csong-wps/github/dirichlet_python/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/csong-wps/github/dirichlet_python/venv/lib/python3.6/site-packages/sklearn/base.py", line 74, in clone
    new_object_params = estimator.get_params(deep=False)
  File "/home/csong-wps/github/dirichlet_python/venv/lib/python3.6/site-packages/sklearn/base.py", line 195, in get_params
    value = getattr(self, key)
AttributeError: 'FullDirichletCalibrator' object has no attribute 'weights_init'

I think the likely cause is an incompatible version of scikit-learn. Which version do you use in the tests? (My version is 0.24.0.)
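The traceback above ends in get_params calling getattr(self, key), which fails when a name in the __init__ signature has no attribute of the same name on the instance. The following is a minimal sketch of that mechanism using a simplified stand-in for scikit-learn's base class; MiniEstimator, BrokenCalibrator and FixedCalibrator are hypothetical names, and the issue does not confirm that this is what happens inside FullDirichletCalibrator.

```python
import inspect


class MiniEstimator:
    """Simplified stand-in for sklearn.base.BaseEstimator (illustration only)."""

    def get_params(self):
        # Read the parameter names from the __init__ signature and look up
        # an attribute with the same name for each of them.
        init_params = inspect.signature(self.__init__).parameters
        return {name: getattr(self, name) for name in init_params}


class BrokenCalibrator(MiniEstimator):
    def __init__(self, reg_lambda=0.0, weights_init=None):
        self.reg_lambda = reg_lambda
        # weights_init is accepted but never stored on the instance.


class FixedCalibrator(MiniEstimator):
    def __init__(self, reg_lambda=0.0, weights_init=None):
        self.reg_lambda = reg_lambda
        self.weights_init = weights_init


try:
    BrokenCalibrator().get_params()
except AttributeError as err:
    print("broken:", err)

print("fixed:", FixedCalibrator(reg_lambda=0.1).get_params())
```

If the real class follows this pattern, storing every constructor argument under its own name would let clone and GridSearchCV read the parameters back.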

DirichletCalibrator with full l2 regularization throwing error on passing l2_list

The DirichletCalibrator throws an error when an l2_list is passed for the reg_lambda parameter, as you originally did in your experiment here, where you passed your l2_list as here.

In my case, it throws the error below

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/dirichletcal/calib/fulldirichlet.py in fit(self, X, y, X_val, y_val, *args, **kwargs)
     46                                         reg_norm=self.reg_norm,
     47                                         ref_row=self.ref_row)
---> 48         self.calibrator_.fit(_X, y, *args, **kwargs)
     49         final_loss = log_loss(y_val, self.calibrator_.predict_proba(_X_val))
     50 

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/dirichletcal/calib/multinomial.py in fit(self, X, y, *args, **kwargs)
     95                                      reg_mu=self.reg_mu, ref_row=self.ref_row,
     96                                      initializer=self.initializer,
---> 97                                      reg_format=self.reg_format)
     98         else:
     99             res = scipy.optimize.fmin_l_bfgs_b(func=_objective, fprime=_gradient,

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/dirichletcal/calib/multinomial.py in _newton_update(weights_0, X, XX_T, target, k, method_, maxiter, ftol, gtol, reg_lambda, reg_mu, ref_row, initializer, reg_format)
    233     L_list = [raw_np.float(_objective(weights_0, X, XX_T, target, k, method_,
    234                                       reg_lambda, reg_mu, ref_row, initializer,
--> 235                                       reg_format))]
    236 
    237     weights = weights_0.copy()

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/dirichletcal/calib/multinomial.py in _objective(params, *args)
    151         else:
    152             reg = np.zeros((k, k+1))
--> 153         loss = loss + reg_lambda * np.sum((weights - reg)**2)
    154     else:
    155         weights_hat = weights - np.hstack([weights[:, :-1] * np.eye(k),

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/lib/python3.6/site-packages/jax/interpreters/xla.py in _forward_method(attrname, self, fun, *args)
    985 
    986 def _forward_method(attrname, self, fun, *args):
--> 987   return fun(getattr(self, attrname), *args)
    988 _forward_to_value = partial(_forward_method, "_value")
    989 

TypeError: only integer scalar arrays can be converted to a scalar index

ValueError: failed to initialize intent(inout) array -- input not fortran contiguous (more than 36 classes)

When calibrating a problem with more than 36 classes, the underlying MultinomialRegression uses scipy.optimize.fmin_l_bfgs_b instead of _newton_update, and this seems to raise an exception. Somebody reports that using scipy.optimize.fmin_l_bfgs_b raises the same exception even when the number of classes is small (not tested yet).

from sklearn.datasets import make_classification
from sklearn.preprocessing import MinMaxScaler

from dirichletcal.calib.fulldirichlet import FullDirichletCalibrator

# 37 classes is one more than the 36-class threshold mentioned above
n_classes = 37
X, y = make_classification(n_samples=1000, n_features=n_classes,
                           n_classes=n_classes, n_clusters_per_class=1,
                           n_informative=6)

# Rescale every feature to [0, 1]
scaler = MinMaxScaler()
X = scaler.fit_transform(X)

calibrator = FullDirichletCalibrator()

calibrator.fit(X, y)

This raises the following exception

~/git/dirichlet_python/venv/lib/python3.8/site-packages/scipy/optimize/lbfgsb.py in _minimize_lbfgsb(fun, x0, args, jac, bounds, disp, maxcor, ftol, gtol, eps, maxfun, maxiter, iprint, callback, maxls, finite_diff_rel_step, **unknown_options)
    349     while 1:
    350         # x, f, g, wa, iwa, task, csave, lsave, isave, dsave = \
--> 351         _lbfgsb.setulb(m, x, low_bnd, upper_bnd, nbd, f, g, factr,
    352                        pgtol, wa, iwa, task, iprint, csave, lsave,
    353                        isave, dsave, maxls)

ValueError: failed to initialize intent(inout) array -- input not fortran contiguous
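The message refers to the memory layout of an array handed to the Fortran L-BFGS-B routine. The following NumPy-only sketch shows what a non-contiguous array looks like and how to obtain a contiguous float64 copy; the array names are hypothetical, and the issue does not establish whether the layout, the dtype, or the array type used inside dirichletcal is the actual trigger.

```python
import numpy as np

# A 2-D weight matrix stored in C (row-major) order.
weights = np.arange(12, dtype=np.float64).reshape(3, 4)

# Taking a column yields a strided view that is not contiguous in memory.
x0_view = weights[:, 0]
print("view:", x0_view.flags["C_CONTIGUOUS"], x0_view.flags["F_CONTIGUOUS"])

# ascontiguousarray returns a contiguous float64 array (copying if needed).
x0 = np.ascontiguousarray(x0_view, dtype=np.float64)
print("copy:", x0.flags["C_CONTIGUOUS"], x0.flags["F_CONTIGUOUS"])
```

For a one-dimensional array the C and Fortran layouts coincide, so a contiguous copy satisfies both flags.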

Base class for all Dirichlet versions

We originally had one base class that worked as a front end: https://github.com/dirichletcal/dirichlet_python/blob/master/dirichletcal/__init__.py#L11

At some point we started using the individual classes in dirichletcal.calib.

We may need to unify them again into the base class, and add unittests for the base class.

Typo in DirichletCalibrator

It seems there is still a typo that raises an error when running the code, even after you fixed the other errors in your last commit f36fc1f @perellonieto.

The error is at this line.

/python3.6/site-packages/dirichletcal/__init__.py in fit(self, x, y, x_val, y_val, **kwargs)
     34 
     35     def fit(self, x, y, x_val=None, y_val=None, **kwargs):
---> 36         __self.setup()
     37 
     38         if self.matrix_type == 'diagonal':

NameError: name '_DirichletCalibrator__self' is not defined

This looks like a typo in fit. The call should be made on self (something like self._setup), not on __self as in __self.setup(). I am not sure whether the same typo appears in other parts of the code.
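The odd name in the NameError comes from Python's private name mangling: inside a class body, any identifier that starts with two underscores (and does not end with two) is prefixed with _ClassName. A minimal sketch with a hypothetical Calibrator class, not the package's own code:

```python
class Calibrator:
    """Hypothetical class used only to illustrate name mangling."""

    def __setup(self):
        self.ready = True

    def fit_broken(self):
        # `__self` is mangled to `_Calibrator__self`, a name that does not
        # exist, so this line raises NameError when it runs.
        __self.setup()

    def fit_fixed(self):
        # `self.__setup` is mangled to `self._Calibrator__setup`, which is
        # exactly how the method defined above is stored on the class.
        self.__setup()
        return self


try:
    Calibrator().fit_broken()
except NameError as err:
    print(err)

print(Calibrator().fit_fixed().ready)
```

Moving the double underscore from the object name to the method name is enough to turn the failing call into a working private-method call.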

Add missing unittests

It seems that the current unittests do not cover some classes (as highlighted by some of the raised issues). It would be good to keep a list here of the missing coverage and to open an issue for each individual class that lacks tests.

dirichletcal.calib.diagdirichlet
dirichletcal.calib.fixeddirichlet

Pin dependencies in requirements.txt to ensure reproducibility

None of the dependencies in requirements.txt have pinned versions, so installing them today and running the tests fails. For instance, the label_binarizer in scikit-learn now requires a keyword argument.

Question in regards to weights magnitude in Calibration Map - Linear parametrisation

Hi, great paper at NeurIPS, first of all.

I was reading the paper and found that we assume the W matrix has the following property:

W_ij = Alpha_ij - 1

We know that Alpha_ij > 0 from the Dirichlet distribution, and thus W_ij must be greater than -1. I was wondering how this constraint is reflected in the multinomial.py script. (Let me know if this was the wrong script enforcing that