Comments (6)

andrewdalpino commented on May 20, 2024

Hi @leo23fla, thank you for the great bug report.

Rubix uses the svm (libsvm) extension to power the SVC, SVR, and One Class SVM estimators. It's entirely possible that the 'model' (created by the extension) is not being preserved through serialization.

I have a couple of questions before I start working on this bug:

Are you using the Native Serializer or the Binary Serializer?

Also, are you using the Persistent Model meta estimator as a wrapper around SVC or are you using a Persister object directly?

from ml.

LeoHSRodrigues commented on May 20, 2024

Hi, I'm currently using the Native Serializer with the Persistent Model wrapper.

The training code:

const MODEL_FILE = 'SVC.model';
const PROGRESS_FILE = 'progress.csv';

set_time_limit(0);
ini_set('memory_limit', '-1');

$training = Labeled::build($samples, $labels);

$estimator = new PersistentModel(
    new Pipeline([
        new HTMLStripper(),
        new TextNormalizer(),
        new WordCountVectorizer(1000, 3, new NGram(1, 3)),
        new TfIdfTransformer(),
        new ZScaleStandardizer(),
    ], new SVC(1.0, new Linear(), true, 1e-3, 100.)),
    new Filesystem(MODEL_FILE, true, new Native())
);

$estimator->setLogger(new Screen('sentiment'));

$estimator->train($training);

$estimator->save();

And my model-loading code:

const MODEL_FILE = 'SVC.model';

set_time_limit(0);
ini_set('memory_limit', '-1');

$estimator = PersistentModel::load(new Filesystem(MODEL_FILE));

$result = $estimator->predict($dataset);

var_dump($result);

andrewdalpino commented on May 20, 2024

Thanks @leo23fla

I was able to reproduce the error

It looks like the SVM extension has a separate API for saving and loading the model from disk

I will reach out to the author, Ian, to see if there is a way we can make the Rubix persistable subsystem jibe with the way that the libsvm extension saves/loads the model

There may be some things we can do with magic methods, however I will have to see if they will work with the RedisDB persister (not just disk) as well

The worst case will be that libsvm-based learners won't be able to implement the Persistable interface - instead, we'd have a separate save() and load() method that just takes a file path argument (using the svm extension mechanism under the hood)
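For reference, here is a rough sketch of what that file-path mechanism looks like when the svm extension is used directly. It is not Rubix code: the function name svmRoundTrip, the toy training data, and the temp file name are made up for illustration, and the function simply returns null if the extension is not installed.

```php
<?php

// Sketch only: needs the PECL svm extension; returns null when it is absent.
function svmRoundTrip(string $path): ?float
{
    if (!extension_loaded('svm')) {
        return null;
    }

    // Each row is [label, feature index => value, ...] (toy data).
    $problem = [
        [1, 1 => 0.9, 2 => 0.8],
        [1, 1 => 0.8, 2 => 0.9],
        [-1, 1 => -0.9, 2 => -0.8],
        [-1, 1 => -0.8, 2 => -0.9],
    ];

    $model = (new SVM())->train($problem);

    // The model is written in the extension's own file format,
    // not through PHP's serialize().
    $model->save($path);

    // A separate process would start here with nothing but the file path.
    $restored = new SVMModel();
    $restored->load($path);

    return $restored->predict([1 => 0.85, 2 => 0.85]);
}

$prediction = svmRoundTrip(sys_get_temp_dir() . '/svc-sketch.model');
```

The point is that the trained model only survives through save() and load() on a file path, which is why a generic serializer never sees it.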

I'll keep you posted and feel free to respond with any of your thoughts

LeoHSRodrigues commented on May 20, 2024

Ok. Hope everything goes well.

andrewdalpino commented on May 20, 2024

In the interim, I'd recommend trying out a Neural Network based learner such as Multi Layer Perceptron or Softmax Classifier on your problem

I've persisted/loaded those many times and they work without fault

andrewdalpino commented on May 20, 2024

@leo23fla

As a preliminary fix to this bug we've gone ahead and dropped the Persistable contract from all SVM-based learners

Instead, we've put a save() and load() method on each class that allows you to save model data to a file and subsequently load the model data in another process

Documentation can be found here https://github.com/RubixML/RubixML#svc

These operations are independent of the Rubix Persistable subsystem as the php-svm extension is not compatible with it (does not implement serialization of the model data itself)

I've reached out to the author to coordinate a fix for the issue; however, I have not had a response yet

If warranted, I will use a separate issue to address getting the SVM-based learners back on the Persistable system

One way that this could work is to use the __sleep() and __wakeup() magic methods to essentially save the model params to a file first, read them back, and store them in the learner prior to serialization. However, this sounds janky to me and I will probably not implement it (but it's an example of what can be done)
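To make the __sleep()/__wakeup() idea concrete, here is a minimal, self-contained sketch. NativeModel and Learner are hypothetical stand-ins, not Rubix or php-svm classes: NativeModel imitates an extension object that can only persist itself through its own file-based save/load and refuses normal serialization, and Learner works around that by routing the model through a temp file.

```php
<?php

// Hypothetical stand-in for an extension model that only supports
// file-based persistence.
final class NativeModel
{
    /** @var array */
    private $weights;

    public function __construct(array $weights)
    {
        $this->weights = $weights;
    }

    public function save(string $path): void
    {
        file_put_contents($path, json_encode($this->weights));
    }

    public static function load(string $path): self
    {
        return new self(json_decode(file_get_contents($path), true));
    }

    public function weights(): array
    {
        return $this->weights;
    }

    // Mimic an object that cannot be serialized directly.
    public function __serialize(): array
    {
        throw new RuntimeException('NativeModel cannot be serialized.');
    }
}

// Hypothetical learner that wraps the model.
final class Learner
{
    /** @var NativeModel|null */
    private $model;

    /** @var string|null Raw model file contents captured for serialization. */
    private $blob;

    public function __construct(NativeModel $model)
    {
        $this->model = $model;
    }

    // Save the model to a temp file, read it back, and serialize only the bytes.
    public function __sleep(): array
    {
        $path = tempnam(sys_get_temp_dir(), 'svm');
        $this->model->save($path);
        $this->blob = file_get_contents($path);
        unlink($path);

        return ['blob'];
    }

    // Write the bytes to a temp file and let the model load itself from it.
    public function __wakeup(): void
    {
        $path = tempnam(sys_get_temp_dir(), 'svm');
        file_put_contents($path, $this->blob);
        $this->model = NativeModel::load($path);
        unlink($path);
        $this->blob = null;
    }

    public function model(): NativeModel
    {
        return $this->model;
    }
}

$learner = new Learner(new NativeModel([0.5, -1.25]));
$copy = unserialize(serialize($learner));
```

Because only the raw bytes travel through serialize(), this would in principle work with any persister (disk or Redis), at the cost of two temp-file round trips per save and load.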

I'm closing the issue, as the bug is technically fixed but feel free to continue the conversation by way of your thoughts and/or questions

Thank you again for the great bug report
