
Comments (6)

vruusmann commented on July 18, 2024

I just pushed commit bc19ebc that affects class TreeModelEvaluator.

The return type of method TreeModelEvaluator#evaluate(ModelEvaluationContext) depends on the function type:

  • Classification-type models return an instance of NodeClassificationMap. Looking at the method NodeClassificationMap#getResult(), it is easy to see that a non-null score attribute takes priority over the value attribute of the highest-probability ScoreDistribution element.
  • Regression-type models return an instance of NodeScore. The method NodeScore#getResult() returns the value of the score attribute (after it has been converted to double data type and post-processed as specified by the Target element).
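The priority rule for classification-type nodes can be sketched as follows. This is only an illustration, not the actual NodeClassificationMap code: the class NodeResultSketch and the method resolveResult are hypothetical names, and a node is reduced to a nullable score plus a map from ScoreDistribution value to probability.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NodeResultSketch {

    // Hypothetical helper: 'score' stands for the Node's score attribute (may be null),
    // 'probabilities' maps each ScoreDistribution value to its probability.
    static String resolveResult(String score, Map<String, Double> probabilities) {
        // A non-null score attribute takes priority.
        if (score != null) {
            return score;
        }
        // Otherwise fall back to the highest-probability ScoreDistribution value.
        String best = null;
        double bestProbability = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : probabilities.entrySet()) {
            if (entry.getValue() > bestProbability) {
                best = entry.getKey();
                bestProbability = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> probabilities = new LinkedHashMap<>();
        probabilities.put("yes", 0.3);
        probabilities.put("no", 0.7);
        System.out.println(resolveResult("yes", probabilities)); // the score attribute wins
        System.out.println(resolveResult(null, probabilities));  // the most probable value wins
    }
}
```

Note that in the first call the score ("yes") disagrees with the most probable value ("no"), and the score is still returned.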

The score attribute is required. The PMML specification says the following: "it is not possible that the scoring process ends in a Node which does not have a score attribute".

Regarding the multipleModelMethod="average" issue - are you working with a classification- or regression-type tree ensemble? Is it possible to attach a sample PMML file?

from jpmml-evaluator.

rriegs commented on July 18, 2024

I've also left a comment over at https://groups.google.com/d/msg/jpmml/Du0QMIYyvko/BAq8n9rBgK4J concerning a separate but related question.

Please see attached model and test file at https://groups.google.com/d/msg/jpmml/Du0QMIYyvko/-bnXhyYblFUJ

I'm working with a classification-type tree ensemble. Regression-type tree ensembles do use score with multipleModelMethod="average" as appropriate.

I see the line you've quoted from the PMML spec and can only conclude that the spec is somewhat internally inconsistent. It does claim that the score attribute is required at final Nodes, but also that ScoreDistribution is used to choose the predictedValue if and only if the score attribute is not provided.


vruusmann commented on July 18, 2024

Thank you for the extra input.

As you probably noticed, classification-type ensemble models perform aggregation using the org.jpmml.evaluator.HasProbability interface. During aggregation, there is no distinction between "winner" and "loser" class labels, so the score attribute can be safely ignored.
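To illustrate why the score attribute plays no role here, below is a minimal sketch of averaging per-class probabilities across ensemble members. It does not use the real HasProbability interface; the class ProbabilityAveragingSketch and the method average are hypothetical, each member is represented as a plain map from class label to probability, and a label that is missing from a member is assumed to contribute zero.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ProbabilityAveragingSketch {

    // Hypothetical helper: each map holds the class probabilities reported by one member tree.
    static Map<String, Double> average(List<Map<String, Double>> memberProbabilities) {
        Map<String, Double> result = new LinkedHashMap<>();
        // Sum the probabilities per class label; no label is treated as the "winner".
        for (Map<String, Double> probabilities : memberProbabilities) {
            for (Map.Entry<String, Double> entry : probabilities.entrySet()) {
                result.merge(entry.getKey(), entry.getValue(), Double::sum);
            }
        }
        // Divide each sum by the number of members (assumption: a missing label counts as zero).
        int count = memberProbabilities.size();
        result.replaceAll((label, sum) -> sum / count);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> averaged = average(List.of(
            Map.of("yes", 0.8, "no", 0.2),
            Map.of("yes", 0.4, "no", 0.6)));
        System.out.println(averaged);
    }
}
```

Only the probability values enter the computation, so whatever score a member node declares has no effect on the averaged result.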

For classification-type tree models, it is safe to say that a PMML document is inconsistent if the value of the score attribute does not have a matching ScoreDistribution element. This inconsistency can be discovered using static analysis. In my opinion, it would be too wasteful to perform consistency checks on every NodeClassificationMap instance at runtime.

Static analyzers can be implemented using the Visitor design pattern. Simply create a subclass of org.jpmml.evaluator.visitors.FeatureInspector and apply it to your PMML class model object right after it is unmarshalled from the PMML document.
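The kind of check such an analyzer would perform can be sketched as below. This is not a FeatureInspector subclass, since its exact API is not shown in this thread; instead, the sketch uses a hypothetical, simplified Node class and a plain recursive traversal. It also assumes that a node without any ScoreDistribution elements is not reported as inconsistent.

```java
import java.util.ArrayList;
import java.util.List;

class ScoreConsistencySketch {

    // Hypothetical, simplified stand-in for a PMML Node element.
    static final class Node {
        final String id;
        final String score;                    // score attribute, may be null
        final List<String> distributionValues; // value attributes of ScoreDistribution children
        final List<Node> children;

        Node(String id, String score, List<String> distributionValues, List<Node> children) {
            this.id = id;
            this.score = score;
            this.distributionValues = distributionValues;
            this.children = children;
        }
    }

    // Returns the ids of all nodes whose score has no matching ScoreDistribution value.
    static List<String> findInconsistentNodes(Node root) {
        List<String> inconsistent = new ArrayList<>();
        visit(root, inconsistent);
        return inconsistent;
    }

    private static void visit(Node node, List<String> inconsistent) {
        // Assumption: nodes without ScoreDistribution children are skipped.
        boolean hasDistribution = !node.distributionValues.isEmpty();
        if (node.score != null && hasDistribution && !node.distributionValues.contains(node.score)) {
            inconsistent.add(node.id);
        }
        for (Node child : node.children) {
            visit(child, inconsistent);
        }
    }

    public static void main(String[] args) {
        Node ok = new Node("2", "yes", List.of("yes", "no"), List.of());
        Node bad = new Node("3", "maybe", List.of("yes", "no"), List.of());
        Node root = new Node("1", null, List.of(), List.of(ok, bad));
        System.out.println(findInconsistentNodes(root));
    }
}
```

Running the check once, right after unmarshalling, keeps the cost out of the per-record evaluation path.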

As for the quality of the PMML specification, it is good and unambiguous enough 99% of the time. The remaining 1% represents various edge and corner cases that surface only when the spec is implemented in actual application code. I hope that the JPMML implementation of such cases agrees with proprietary implementations.


rriegs commented on July 18, 2024

Thank you for your responses, Villu. I am satisfied with this explanation and resolution. I will modify my code to handle the required score attributes.


vruusmann commented on July 18, 2024

Actually, class NodeClassificationMap needs some modification, because it does not implement the method org.jpmml.evaluator.CategoricalResultFeature#getCategoryValues() correctly when there are no ScoreDistribution elements available.

The correct behaviour would be to return a singleton set that contains the score attribute value.
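A minimal sketch of that fallback is given below. It is not the actual fix: the class CategoryValuesSketch and the static method signature are hypothetical, and the node is again reduced to its score attribute plus the list of ScoreDistribution values.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CategoryValuesSketch {

    // Hypothetical helper: 'distributionValues' holds the value attributes
    // of the Node's ScoreDistribution children (possibly none).
    static Set<String> getCategoryValues(String score, List<String> distributionValues) {
        // Without ScoreDistribution elements, the score is the only known category.
        if (distributionValues.isEmpty()) {
            return Collections.singleton(score);
        }
        // Otherwise the categories are the distinct ScoreDistribution values.
        return new LinkedHashSet<>(distributionValues);
    }

    public static void main(String[] args) {
        System.out.println(getCategoryValues("yes", List.of()));
        System.out.println(getCategoryValues("yes", List.of("yes", "no")));
    }
}
```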

This fix is in the works. I will push it to the repository later in the evening.


vruusmann commented on July 18, 2024

Commit d0b1dd8 makes sure that class NodeClassificationMap implements the interface HasProbability (and its superinterface CategoricalResultFeature) correctly in a situation where the Node element does not have any ScoreDistribution child elements.

