
ConformalPrediction.jl's People

Contributors

github-actions[bot], john-waczak, mojifarmanbar, pat-alt, pitmonticone, rikhuijzer


ConformalPrediction.jl's Issues

Add support for Quantile regression

MLJLinearModels.jl has a QuantileRegressor that (I presume) returns an interval. Section 2.2 outlines how Conformalized Quantile Regression (CQR) can be implemented. As with #32, I'm not sure how easy it is to simply adapt the score functions such that all of the different approaches to conformalizing regression can still be used. It is probably easier than #32 and would definitely be desirable in this case.
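For context, a minimal sketch of what the CQR score and resulting interval could look like (my reading of the paper, not this package's API; all names are hypothetical):

using Statistics

# CQR nonconformity score: by how much y falls outside [q_lo, q_hi]
cqr_score(y, q_lo, q_hi) = max.(q_lo .- y, y .- q_hi)

# conformalized interval for a new point, given calibration scores
function cqr_interval(q_lo_new, q_hi_new, scores; coverage=0.9)
    n = length(scores)
    q̂ = quantile(scores, min(1.0, ceil((n + 1) * coverage) / n))  # finite-sample correction
    return (q_lo_new - q̂, q_hi_new + q̂)
end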

CP for LLMs

Issue set up for Experiment Week

  • Study this paper
  • Get data from Hugging Face
  • Train a small transformer model from scratch
  • Look at fine-tuning a pre-trained model

Refactor type hierarchy

We are already looking at quite a few abstract types. It may be more useful to reduce their use and instead rely on the Tim Holy Trait Trick (THTT).
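For reference, a minimal sketch of the THTT pattern with hypothetical model and trait names:

struct SimpleInductiveRegressor end
struct JackknifeRegressor end

struct Inductive end
struct Transductive end

# trait function: map a model type to a trait value
training_style(::Type{SimpleInductiveRegressor}) = Inductive()
training_style(::Type{JackknifeRegressor}) = Transductive()

# dispatch on the trait value instead of on an abstract supertype
requires_calibration_set(model) = requires_calibration_set(training_style(typeof(model)))
requires_calibration_set(::Inductive) = true
requires_calibration_set(::Transductive) = false

This keeps the type hierarchy flat: adding a new model only requires one extra training_style method.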

Set-valued predictions for MLJ

The goal for this package is to seamlessly interact with MLJ. To that end all conformal models implement the compulsory MMI.fit and MMI.predict methods (following guidelines set out here).

With respect to downstream tasks (in particular evaluation) we are facing a problem: predictions for standard conformal classifiers are set-valued. Currently MLJ supports Interval-valued predictions, but not Sets (see related discussion #20).

A long-term goal is to align MLJ and this package. Any thoughts/comments/help welcome!
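For anyone following along, a rough sketch of what the compulsory interface looks like for a hypothetical set-valued conformal classifier (illustrative names, not the package's actual types):

import MLJModelInterface as MMI

struct ConformalSetClassifier <: MMI.Probabilistic
    model             # atomic classifier to be conformalized
    coverage::Float64
end

function MMI.fit(conf::ConformalSetClassifier, verbosity, X, y)
    fitresult, cache, report = MMI.fit(conf.model, verbosity, X, y)
    # ... also compute and store calibration scores in fitresult ...
    return fitresult, cache, report
end

function MMI.predict(conf::ConformalSetClassifier, fitresult, Xnew)
    # would return one label set per row of Xnew; exactly the set-valued
    # output for which MLJ currently has no scientific type
end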

Add traits for custom measures (and later contribute for general use)

    @pat-alt Great to hear about your progress!

Q1: Firstly, should I extend MMI.evaluate to assert that users only use one of the two applicable custom measures?

Generally, the kind of target proxy the measure is used for is articulated with the prediction_type trait. (Measures have traits, just like models. The manual mentions this, but you'll also want to look here if you're contributing new measures.) So, you would do something like:

StatisticalTraits.prediction_type(::Type{<:YourMeasureType}) = :probabilistic_set

edited: The model version of this trait is already suitably overloaded here:

https://github.com/JuliaAI/MLJModelInterface.jl/blob/d9e9703947fc04b0a5e63680289e41d0ba0d65bd/src/model_traits.jl#L27

The evaluate apparatus in MLJBase should check the model matches the measure and throw an error if it doesn't. Possibly, as this is a new target proxy type, the behaviour at MLJBase may need to be adjusted. The relevant logic lives approximately here:

https://github.com/JuliaAI/MLJBase.jl/blob/d79f29b78c5068377e25363884e2ea1c4b4a149a/src/resampling.jl#L600
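To make this concrete, a hedged sketch of a hypothetical set-valued measure carrying the trait:

import StatisticalTraits
using Statistics

struct EmpiricalCoverage end

# ŷ is a vector of label sets; the measure is the share of observations
# whose true label lands in the predicted set
(::EmpiricalCoverage)(ŷ, y) = mean(y[i] in ŷ[i] for i in eachindex(y))

StatisticalTraits.prediction_type(::Type{EmpiricalCoverage}) = :probabilistic_set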

Q2:

Do you always see this rubbish, or just for your custom measure? Where are you viewing this: an ordinary terminal, VSCode, a notebook, or something else? Could you please try MLJ.color_off() and see if that helps?

Originally posted by @ablaom in #40 (comment)

[Testing] Ensure that empirical coverages check out

To ensure that all methods are implemented correctly, we should verify that the empirical coverage rates match the theoretical guarantees. To some extent this has already been done in the documentation, but we can be a bit more serious about it.

  • Add a guide to docs/explanation that runs all implemented methods and compares the empirical coverage to the theoretical coverage (see this table for reference)
  • Add these sanity checks as unit tests (a rough sketch follows below)
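Such a unit test could look roughly like this (assuming a fitted conformal machine mach, a holdout set X_test, y_test, and interval predictions supporting first/last):

using Test, Statistics

cov = 0.95                         # nominal coverage
ŷ = predict(mach, X_test)          # interval-valued predictions
covered = [first(ŷ[i]) <= y_test[i] <= last(ŷ[i]) for i in eachindex(y_test)]
@test mean(covered) >= cov - 0.05  # allow some finite-sample slack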

Conformal Training

Finally add full support for conformal training.

  • Add differentiable sort (one possible relaxation is sketched below)
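One simple relaxation (an illustration, not necessarily the approach this package will adopt) replaces the hard comparison in ranking with a sigmoid:

σ(z; τ=0.1) = 1 / (1 + exp(-z / τ))  # temperature τ controls smoothness

# soft rank: the self-comparison contributes σ(0) = 0.5, hence the offset
soft_rank(x; τ=0.1) = [0.5 + sum(σ(xi - xj; τ) for xj in x) for xi in x]

A differentiable quantile, as needed for the calibration step, could then be built on top of the soft ranks.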

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours; please be patient!

Wrap models not machines?

@pat-alt

Congratulations on the launch of this new package 🎉 Great to have the integration with MLJ!

I'm not familiar with conformal prediction, but I nevertheless wonder why this package wraps MLJ machines rather than models. If you wrap models, then you buy into MLJ's model composition. So, a "conformally wrapped model" will behave like any other model: you can insert it in a pipeline, wrap it in a tuning strategy, and so forth.

New models in MLJ generally implement the "basement level" model API. Machines are a higher-level abstraction for: (i) user interaction; and (ii) syntax for building learning networks, which are ultimately "exported" as standalone model types.

Here are other examples of model wrapping in MLJ: EnsembleModel (docs), BinaryThresholdPredictor, TunedModel, IteratedModel.

What makes things a little complicated is the model hierarchy: the model supertype for the wrapped model depends on the supertype of the atomic model. So, for example, we don't just have EnsembleModel; we have DeterministicEnsembleModel (for ordinary point predictors) and ProbabilisticEnsembleModel (for probabilistic predictors), but the user only sees a single constructor, EnsembleModel; see here. (A longer-term goal is to drop the hierarchy in favour of a pure trait interface, which will simplify things, but that's a little ways off yet.)
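A small usage illustration of the wrapping pattern (keyword names may differ slightly across MLJ versions; X and y are assumed to be some table and target):

using MLJ
Tree = @load DecisionTreeRegressor pkg=DecisionTree
forest = EnsembleModel(model=Tree(), n=100)  # the wrapped model behaves like any other
mach = machine(forest, X, y)                 # so it composes with tuning, pipelines, etc.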

Happy to provide further guidance.

cc @azev77

Package stability

Hi,

I want to introduce this package in my teaching. I know you warn that the API is not stable, but I wanted to check whether this warning is serious or just a safety belt, i.e. whether you have mostly worked out the required design?

[Feature request] Conformal Predictive Distributions

Hi & thanks for this package.
I've been waiting for a package for conformal prediction...

Here is some sample code from my test drive which may or may not be useful for docs:

using Pkg
Pkg.add.(["MLJ", "EvoTrees", "Plots"])
Pkg.add(url="https://github.com/pat-alt/ConformalPrediction.jl")
using MLJ, EvoTrees, ConformalPrediction, Plots, Statistics, Random;
######################################## generate synthetic linear data
rng = MersenneTwister(49); # rng = Random.GLOBAL_RNG;
n = 100_000; p = 7; σ = 0.1;
X = [ones(n) randn(rng, n, p - 1)]
θ = randn(rng, p)
y = X * θ .+ σ .* randn(rng, n)
train, calibration, test = partition(eachindex(y), 0.4, 0.4)
######################################## fit a point predictor
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees
model = EvoTreeRegressor()
mach = machine(model, X, y)
fit!(mach, rows=train)
pr_y = predict(mach, rows=test)
######################################## conformalize (hypothetical API from this test drive)
conf_mach = conformal_machine(mach)
calibrate!(conf_mach, selectrows(X, calibration), y[calibration])
pr = predict(conf_mach, X[test, :]; coverage=0.95)

# extract lower/upper interval bounds for each test point
pr_lower = [pr[j][1][2][] for j in 1:length(test)]
pr_upper = [pr[j][2][2][] for j in 1:length(test)]

########################################### plot predictions and bounds
plot()
plot!(y[test], lab="y test")
plot!(pr_y, lab="y prediction")
plot!(pr_lower, lab="y 95% prediction lower bound")
plot!(pr_upper, lab="y 95% prediction upper bound")

# empirical coverage on the test set
mean(pr_lower .<= y[test] .<= pr_upper)

[Docs] Add more how-to guides

In #42 I've just added a guide explaining how to conformalize a deep learning image classifier. It would be great to have more guides like this. Ideas for use cases and contributions are welcome.

[maybe] Refactor Jackknife and CV methods (DRY)

Currently we have separate constructors for Jackknife and CV methods, e.g. JackknifePlusRegressor and CVPlusRegressor. Since the former is just a special case of the latter (for which nfold=nobs), technically this makes the former constructor redundant. By getting rid of it we could make the code base more DRY ("don't repeat yourself").

The problem with this idea is that the models have no access to data at instantiation, so nobs is unknown. So if we want to keep a separate model type (JackknifePlusRegressor), making the code more DRY would come with its own complications.

I'm currently undecided, so I won't fix this myself, but I'm leaving it here for discussion.
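One possible compromise, sketched with hypothetical field names: keep a single CV+ type whose nfolds is resolved to nobs only at fit time, and reduce the jackknife constructor to a thin alias:

Base.@kwdef mutable struct CVPlusRegressor
    model = nothing
    nfolds::Union{Int,Nothing} = 5   # `nothing` means "set nfolds = nobs once data arrives"
end

JackknifePlusRegressor(; model=nothing) = CVPlusRegressor(model=model, nfolds=nothing)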

Full module for Conformal Training

  • move existing things into their own module (may become a separate package in the future)
  • implement complete algorithm for training
  • tests
  • docs

Do you really need MLJ dependency?

I notice that CP takes a while to load/precompile. I'd be surprised if you really need MLJ as a dependency. MLJ essentially just collects these components, and you probably don't need all of them:

  • MLJBase
  • MLJModels (maybe, if you use some of the built-in transformers)
  • MLJEnsembles (unlikely)
  • MLJIteration (unlikely)
  • MLJTuning

Ordinary model interfaces just need MLJModelInterface, which is very lightweight. But you will need MLJBase if you are using composite model tools like learning networks, pipelines, etc. And if you are extending measures (but this will ultimately move out). You need it for machines too, but I thought that was factored out now, yes?

Add support for ConfTr

This ICLR 2022 paper shows how to train conformal classifiers.

  • Add losses for the prediction step
  • Streamline (need a separate score method for dealing with MLJFlux) - done in b4c7140
  • Add support for differentiable quantile computations (calibration step; a rough sketch below)
  • Implement batch training procedure
  • Test and document
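For the calibration step, a rough sketch of the paper's smooth set-size penalty as I read it (not this package's implementation; κ is the target set size, T a temperature):

σ(z; T=0.1) = 1 / (1 + exp(-z / T))   # soft set membership

# s: per-class nonconformity scores, τ: (smoothed) quantile threshold
soft_size(s, τ; T=0.1) = sum(σ(τ - sk; T) for sk in s)
size_loss(s, τ; κ=1, T=0.1) = max(0, soft_size(s, τ; T) - κ)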
