jpmml / jpmml-evaluator
Java Evaluator API for PMML
License: GNU Affero General Public License v3.0
Hi, I have a problem. In my scenario the model is retrained every day, but JPMML uses a LoadingCache as its cache, and that cache removes stale entries lazily. As a result the JVM heap grows very large and can even run out of memory (OOM).
The solution: use weakKeys() and weakValues() together:
private static LoadingCache<MiningModel, BiMap<String, Segment>> entityCache = CacheUtil.buildLoadingCache(new CacheLoader<MiningModel, BiMap<String, Segment>>(){
@Override
public BiMap<String, Segment> load(MiningModel miningModel){
Segmentation segmentation = miningModel.getSegmentation();
return EntityUtil.buildBiMap(segmentation.getSegments());
}
});
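For readers who want to see the weak-key idea in isolation, here is a minimal JDK-only sketch. It is not JPMML's CacheUtil and not Guava: WeakKeyCache is a hypothetical class built on java.util.WeakHashMap, and it differs from Guava's weakKeys() in that WeakHashMap compares keys with equals() rather than by identity. The intent is only to show that a cache keyed weakly by the model object stops pinning entries once the model itself is no longer referenced.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical helper (not part of JPMML or Guava): a loading cache with weakly held keys. */
final class WeakKeyCache<K, V> {

    // WeakHashMap holds its keys weakly: once a key is unreachable elsewhere,
    // the garbage collector is free to drop the whole entry.
    private final Map<K, V> entries = Collections.synchronizedMap(new WeakHashMap<>());

    // Computes the value for a key that is not cached yet.
    private final Function<K, V> loader;

    WeakKeyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading it on first access. */
    V get(K key) {
        // The synchronized wrapper makes computeIfAbsent atomic with respect to other calls.
        return entries.computeIfAbsent(key, loader);
    }

    public static void main(String[] args) {
        int[] loads = {0};
        WeakKeyCache<Object, String> cache = new WeakKeyCache<>(k -> "value#" + (++loads[0]));
        Object model = new Object(); // stands in for a MiningModel instance
        System.out.println(cache.get(model)); // prints "value#1"
        System.out.println(cache.get(model)); // prints "value#1" again, served from the cache
    }
}
```

One caveat with this approach: the values are held strongly, so if a cached value keeps a reference back to its key, the entry can never be collected. That is presumably why the suggestion above combines weakKeys() with weakValues().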
16/12/12 19:04:26 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, byd0158): java.lang.NoSuchMethodError: org.dmg.pmml.MiningField.getOptype()Lorg/dmg/pmml/OpType;
at org.jpmml.evaluator.ArgumentUtil.isOutlier(ArgumentUtil.java:153)
at org.jpmml.evaluator.ArgumentUtil.prepare(ArgumentUtil.java:69)
at org.jpmml.evaluator.ModelEvaluator.prepare(ModelEvaluator.java:110)
at org.jpmml.spark.PMMLTransformer$2.apply(PMMLTransformer.java:120)
at org.jpmml.spark.PMMLTransformer$2.apply(PMMLTransformer.java:110)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply119245_186$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Why do subclasses of Computable such as InstanceClassificationMap throw an EvaluationException when getResult is called and the result is null (e.g. https://github.com/jpmml/jpmml-evaluator/blob/master/pmml-evaluator/src/main/java/org/jpmml/evaluator/InstanceClassificationMap.java)? Wouldn't it be more correct to leave the interpretation of an expected/unexpected "null" result up to the caller?
I am working on a RulesInduction model, and JPMML keeps complaining about the PMML file whose contents I have copied into this issue. I am not able to figure out what the problem is. Please help me with it.
Exception:
org.jpmml.evaluator.EvaluationException
at org.jpmml.evaluator.CategoricalValue.compareToString(CategoricalValue.java:39)
at org.jpmml.evaluator.FieldValue.compareTo(FieldValue.java:143)
at org.jpmml.evaluator.PredicateUtil.evaluateSimplePredicate(PredicateUtil.java:131)
at org.jpmml.evaluator.PredicateUtil.evaluatePredicate(PredicateUtil.java:63)
at org.jpmml.evaluator.PredicateUtil.evaluate(PredicateUtil.java:51)
at org.jpmml.evaluator.PredicateUtil.evaluateCompoundPredicateInternal(PredicateUtil.java:200)
at org.jpmml.evaluator.PredicateUtil.evaluateCompoundPredicate(PredicateUtil.java:168)
at org.jpmml.evaluator.PredicateUtil.evaluatePredicate(PredicateUtil.java:71)
at org.jpmml.evaluator.PredicateUtil.evaluate(PredicateUtil.java:51)
at org.jpmml.evaluator.RuleSetModelEvaluator.evaluateRule(RuleSetModelEvaluator.java:190)
at org.jpmml.evaluator.RuleSetModelEvaluator.evaluateRules(RuleSetModelEvaluator.java:216)
at org.jpmml.evaluator.RuleSetModelEvaluator.evaluateClassification(RuleSetModelEvaluator.java:109)
at org.jpmml.evaluator.RuleSetModelEvaluator.evaluate(RuleSetModelEvaluator.java:84)
at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:406)
at com.norkom.blake.pmml.IrepRuleTest.makePredictions(IrepRuleTest.java:177)
at com.norkom.blake.pmml.IrepRuleTest.testRules(IrepRuleTest.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:176)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
PMML file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_2">
<DataDictionary numberOfFields="13">
<DataField name="TransactionType" optype="categorical" dataType="string">
<Value value="ATM"/>
<Value value="Point of Sale"/>
<Value value="Point of Sale BGC"/>
<Value value="Term Deposit Post Office"/>
</DataField>
<DataField name="Amount" optype="continuous" dataType="double"/>
<DataField name="CreditOrDebit" optype="categorical" dataType="string">
<Value value="Credit"/>
<Value value="Debit"/>
</DataField>
<DataField name="Currency" optype="categorical" dataType="string">
<Value value="DOLLAR"/>
<Value value="EUR"/>
</DataField>
<DataField name="POSAmount3Days.acc.day.present" optype="continuous" dataType="double"/>
<DataField name="POSAmount3Days.acc.day.total" optype="continuous" dataType="double"/>
<DataField name="POSAmount4hr.acc.hour4" optype="continuous" dataType="double"/>
<DataField name="POSAmount60Mins.acc.minute60" optype="continuous" dataType="double"/>
<DataField name="POSCount3Days.cnt.day.present" optype="continuous" dataType="double"/>
<DataField name="POSCount3Days.cnt.day.total" optype="continuous" dataType="double"/>
<DataField name="POSCount4hr.cnt.hour4" optype="continuous" dataType="double"/>
<DataField name="POSCount60Mins.cnt.minute60" optype="continuous" dataType="double"/>
<DataField name="Fraud" optype="categorical" dataType="double">
<Value value="0.0"/>
<Value value="1.0"/>
</DataField>
</DataDictionary>
<RuleSetModel modelName="RulesSetModel" functionName="classification">
<MiningSchema>
<MiningField name="TransactionType"/>
<MiningField name="Amount"/>
<MiningField name="CreditOrDebit"/>
<MiningField name="Currency"/>
<MiningField name="POSAmount3Days.acc.day.present"/>
<MiningField name="POSAmount3Days.acc.day.total"/>
<MiningField name="POSAmount4hr.acc.hour4"/>
<MiningField name="POSAmount60Mins.acc.minute60"/>
<MiningField name="POSCount3Days.cnt.day.present"/>
<MiningField name="POSCount3Days.cnt.day.total"/>
<MiningField name="POSCount4hr.cnt.hour4"/>
<MiningField name="POSCount60Mins.cnt.minute60"/>
<MiningField name="Fraud" usageType="target"/>
</MiningSchema>
<RuleSet recordCount="5152.0" nbCorrect="5033.0" defaultScore="0" defaultConfidence="0.0">
<RuleSelectionMethod criterion="firstHit"/>
<SimpleRule id="Rule0" score="1.0" recordCount="95.0" nbCorrect="89.0" confidence="0.9325842696629213">
<CompoundPredicate booleanOperator="and">
<SimplePredicate field="POSAmount4hr.acc.hour4" operator="greaterOrEqual" value="104.1"/>
<SimplePredicate field="POSAmount4hr.acc.hour4" operator="greaterOrEqual" value="182.63"/>
</CompoundPredicate>
<ScoreDistribution value="0.0" recordCount="6.0"/>
<ScoreDistribution value="1.0" recordCount="89.0"/>
</SimpleRule>
<SimpleRule id="Rule1" score="1.0" recordCount="8.0" nbCorrect="8.0" confidence="1.0">
<CompoundPredicate booleanOperator="and">
<SimplePredicate field="POSAmount4hr.acc.hour4" operator="greaterOrEqual" value="80.0"/>
<SimplePredicate field="Amount" operator="greaterOrEqual" value="104.1"/>
<SimplePredicate field="Amount" operator="lessOrEqual" value="104.16"/>
</CompoundPredicate>
<ScoreDistribution value="0.0" recordCount="0.0"/>
<ScoreDistribution value="1.0" recordCount="8.0"/>
</SimpleRule>
<SimpleRule id="Rule2" score="1.0" recordCount="16.0" nbCorrect="13.0" confidence="0.7692307692307693">
<CompoundPredicate booleanOperator="and">
<SimplePredicate field="POSAmount60Mins.acc.minute60" operator="greaterOrEqual" value="37.64"/>
<SimplePredicate field="POSAmount3Days.acc.day.present" operator="greaterOrEqual" value="148.57"/>
<SimplePredicate field="TransactionType" operator="greaterOrEqual" value="13.0"/>
<SimplePredicate field="POSAmount3Days.acc.day.present" operator="greaterOrEqual" value="261.19"/>
</CompoundPredicate>
<ScoreDistribution value="0.0" recordCount="3.0"/>
<ScoreDistribution value="1.0" recordCount="13.0"/>
</SimpleRule>
<SimpleRule id="Rule3" score="1.0" recordCount="8.0" nbCorrect="6.0" confidence="0.6666666666666666">
<CompoundPredicate booleanOperator="and">
<SimplePredicate field="POSAmount4hr.acc.hour4" operator="greaterOrEqual" value="90.95"/>
<SimplePredicate field="Amount" operator="greaterOrEqual" value="147.57"/>
</CompoundPredicate>
<ScoreDistribution value="0.0" recordCount="2.0"/>
<ScoreDistribution value="1.0" recordCount="6.0"/>
</SimpleRule>
<SimpleRule id="Rule4" score="1.0" recordCount="4.0" nbCorrect="3.0" confidence="0.6666666666666666">
<CompoundPredicate booleanOperator="and">
<SimplePredicate field="POSAmount4hr.acc.hour4" operator="greaterOrEqual" value="90.95"/>
<SimplePredicate field="POSAmount60Mins.acc.minute60" operator="lessOrEqual" value="90.95"/>
</CompoundPredicate>
<ScoreDistribution value="0.0" recordCount="1.0"/>
<ScoreDistribution value="1.0" recordCount="3.0"/>
</SimpleRule>
</RuleSet>
</RuleSetModel>
</PMML>
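The trace above ends in CategoricalValue.compareToString, reached from evaluateSimplePredicate, which suggests that an ordering operator is being applied to a categorical field. The following is a rough diagnostic sketch, assuming that reading is right: it uses only the JDK's DOM parser (not JPMML) to list every SimplePredicate that applies lessThan, lessOrEqual, greaterThan or greaterOrEqual to a DataField declared with optype="categorical". PredicateCheck and findSuspectPredicates are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical diagnostic helper; not part of JPMML. */
final class PredicateCheck {

    // PMML comparison operators that only make sense for ordered values.
    private static final Set<String> ORDERING =
        Set.of("lessThan", "lessOrEqual", "greaterThan", "greaterOrEqual");

    /** Returns "field operator value" for each ordering predicate on a categorical DataField. */
    static List<String> findSuspectPredicates(String pmmlXml) {
        Document doc;
        try {
            // The default factory is not namespace-aware, so plain tag names match.
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(pmmlXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse PMML document", e);
        }

        // Collect the names of all fields declared as categorical.
        Set<String> categorical = new HashSet<>();
        NodeList fields = doc.getElementsByTagName("DataField");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if ("categorical".equals(field.getAttribute("optype"))) {
                categorical.add(field.getAttribute("name"));
            }
        }

        // Flag predicates that compare such a field with an ordering operator.
        List<String> suspects = new ArrayList<>();
        NodeList predicates = doc.getElementsByTagName("SimplePredicate");
        for (int i = 0; i < predicates.getLength(); i++) {
            Element predicate = (Element) predicates.item(i);
            String name = predicate.getAttribute("field");
            String operator = predicate.getAttribute("operator");
            if (categorical.contains(name) && ORDERING.contains(operator)) {
                suspects.add(name + " " + operator + " " + predicate.getAttribute("value"));
            }
        }
        return suspects;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a PMML file.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        findSuspectPredicates(xml).forEach(System.out::println);
    }
}
```

Run against the PMML file above, this should report only the Rule2 predicate on TransactionType (operator greaterOrEqual, value 13.0), since TransactionType is a categorical string field with values such as "ATM". Whether that predicate is really what the evaluator rejects is an inference from the stack trace, not something confirmed in this thread.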
I'm using a neural network (NN) and providing an API for prediction queries.
I am expecting ordinary input parameters like age=26,gender=m.
So I have to do some pre-processing before passing these into the NN evaluator.
Does JPMML support TransformationDictionary?
If yes, in which package, and how?
If no, is there any plan scheduled?
In 1.1.7, when we try to consume a logistic regression expressed as a RegressionModel, we encounter the error message below.
We also tried linear regression and regression with more than two categories, and they all work fine. We also tried switching back to 1.1.3; under 1.1.3 the logistic regression works fine as well.
Exception in thread "main" org.jpmml.manager.InvalidFeatureException (at or around line 33): RegressionModel
at org.jpmml.evaluator.RegressionModelEvaluator.evaluateClassification(RegressionModelEvaluator.java:130)
at org.jpmml.evaluator.RegressionModelEvaluator.evaluate(RegressionModelEvaluator.java:71)
at org.jpmml.evaluator.MiningModelEvaluator.evaluateSegmentation(MiningModelEvaluator.java:425)
at org.jpmml.evaluator.MiningModelEvaluator.evaluateClassification(MiningModelEvaluator.java:211)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:108)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:86)
at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:68)
at org.jpmml.evaluator.CsvEvaluationExample.evaluateAll(CsvEvaluationExample.java:226)
at org.jpmml.evaluator.CsvEvaluationExample.execute(CsvEvaluationExample.java:97)
at org.jpmml.evaluator.Example.execute(Example.java:45)
at org.jpmml.evaluator.CsvEvaluationExample.main(CsvEvaluationExample.java:72)
Hi Villu,
I have also attached my file.
I am generating PMML for a NeuralNetwork, but when I use the evaluator it keeps throwing this exception:
org.jpmml.evaluator.InvalidFeatureException: MiningField
at org.jpmml.evaluator.IndexableUtil.buildMap(IndexableUtil.java:72)
at org.jpmml.evaluator.IndexableUtil.buildMap(IndexableUtil.java:61)
at org.jpmml.evaluator.ModelEvaluator$4.load(ModelEvaluator.java:688)
at org.jpmml.evaluator.ModelEvaluator$4.load(ModelEvaluator.java:684)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
at com.google.common.cache.LocalCache.get(LocalCache.java:4053)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4986)
at org.jpmml.evaluator.CacheUtil.getValue(CacheUtil.java:51)
at org.jpmml.evaluator.ModelEvaluator.<init>(ModelEvaluator.java:128)
at org.jpmml.evaluator.neural_network.NeuralNetworkEvaluator.<init>(NeuralNetworkEvaluator.java:90)
at org.jpmml.evaluator.neural_network.NeuralNetworkEvaluator.<init>(NeuralNetworkEvaluator.java:86)
at com.baesystems.ai.analytics.smile.pmml.NeuralNetworkPMMLTest.createEvaluator(NeuralNetworkPMMLTest.java:130)
at com.baesystems.ai.analytics.smile.pmml.NeuralNetworkPMMLTest.testLeastMeanSqaures(NeuralNetworkPMMLTest.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
From the PMML spec (versions 2.0 and up):
When a Node is selected as the final Node and if this Node has no score attribute, then the highest recordCount in the ScoreDistribution determines which value is selected as the predicted class. If a Node contains a sequence of ScoreDistribution elements such that there is more than one entry where recordCount_i is an upper bound, then the first entry is selected.
Note: If a Node has an attribute score then this attribute value overrides the computation of a predicted value from the ScoreDistribution.
The above suggests that it should be OK for a terminal Node in a TreeModel to omit the score attribute so long as it contains at least one ScoreDistribution element and, further, that including a score attribute may in fact weaken the contribution of the ScoreDistributions (though it is of course always possible to add a score attribute that accurately reflects the behavior specified above).
Note that, when using multipleModelMethod="average" for a series of TreeModels, jpmml-evaluator (as of 1.1.17) appears to completely ignore the score attributes (i.e. you can set them all to "foo"), instead relying entirely on the ScoreDistributions to make its prediction. It seems odd to be required to provide an attribute that isn't going to be used at all.
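To make the quoted rule concrete, here is a small sketch of the selection logic as the spec text describes it: an explicit score wins, otherwise the ScoreDistribution entry with the highest recordCount is chosen, and the first entry wins a tie. This is a reading of the spec, not jpmml-evaluator's implementation; NodeScore and predict are hypothetical names, and the map is assumed to iterate in document order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of the spec text quoted above; not JPMML code. */
final class NodeScore {

    /**
     * @param score             the Node's score attribute, or null if it is absent
     * @param scoreDistribution value -> recordCount, iterated in document order
     */
    static String predict(String score, Map<String, Double> scoreDistribution) {
        if (score != null) {
            return score; // an explicit score overrides the ScoreDistribution
        }
        String best = null;
        double bestCount = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : scoreDistribution.entrySet()) {
            // Strict comparison: on a tie the earlier entry is kept.
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> distribution = new LinkedHashMap<>();
        distribution.put("0.0", 6.0);
        distribution.put("1.0", 89.0);
        System.out.println(predict(null, distribution));  // prints "1.0"
        System.out.println(predict("foo", distribution)); // prints "foo"
    }
}
```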
I'm using Scala and SBT. In my build.sbt, I added this line:
libraryDependencies += "org.jpmml" % "jpmml-evaluator" % "1.3.3"
But I still get the error "jpmml is not a member of package org" when importing.
For more information: the Scala version is 2.11.8.
My scenario is this: I trained a random forest and exported it as a PMML file.
I use multipleModelMethod=weightedAverage and want to output the per-class probabilities of the label.
The PMML file looks like this:
Then it throws an exception:
org.jpmml.evaluator.TypeCheckException: Expected org.jpmml.evaluator.HasProbability, but got org.jpmml.evaluator.ClassificationMap ({0=0.6526508348685987, 1=0.3473491651314011})
at org.jpmml.evaluator.OutputUtil.asResultFeature(OutputUtil.java:848)
at org.jpmml.evaluator.OutputUtil.getProbability(OutputUtil.java:478)
at org.jpmml.evaluator.OutputUtil.evaluate(OutputUtil.java:182)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:117)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:85)
at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:79)
at com.alipay.mymdp.model.component.impl.pmml.engine.PmmlComponentEngine.execute(PmmlComponentEngine.java:49)
at com.alipay.mymdp.model.component.impl.pmml.engine.PmmlComponentEngine.executePmmlComponentEngine(PmmlComponentEngine.java:35)
at com.alipay.mymdp.model.component.impl.pmml.engine.TestPmmlComponentEngine.testRF2PmmlCom
Hi Villu,
ProbabilityDistribution prob = (ProbabilityDistribution) results.get(evaluator.getTargetField().getName());
This returns null, and I don't know what is going on.
I have tried to match my PMML with the example that you showed me, but it still fails.
Can you please look into it and guide me?
Thanks
Hi,
I keep getting this exception for my LogisticRegression model and my LinearRegression model. This is my XML. Please guide me as to what the problem is.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_2">
<DataDictionary numberOfFields="9">
<DataField name="Attribute0" optype="continuous" dataType="double"/>
<DataField name="Attribute1" optype="continuous" dataType="double"/>
<DataField name="Attribute2" optype="continuous" dataType="double"/>
<DataField name="Attribute3" optype="continuous" dataType="double"/>
<DataField name="Attribute4" optype="continuous" dataType="double"/>
<DataField name="Attribute5" optype="continuous" dataType="double"/>
<DataField name="Attribute6" optype="continuous" dataType="double"/>
<DataField name="Attribute7" optype="continuous" dataType="double"/>
<DataField name="Attribute8" optype="continuous" dataType="double"/>
</DataDictionary>
<RegressionModel functionName="classification" algorithmName="logisticRegression" normalizationMethod="logit">
<MiningSchema>
<MiningField name="Attribute0"/>
<MiningField name="Attribute1"/>
<MiningField name="Attribute2"/>
<MiningField name="Attribute3"/>
<MiningField name="Attribute4"/>
<MiningField name="Attribute5"/>
<MiningField name="Attribute6"/>
<MiningField name="Attribute7"/>
<MiningField name="Attribute8" usageType="target"/>
</MiningSchema>
<RegressionTable intercept="0.0" targetCategory="1"/>
<RegressionTable intercept="-8.397856251858588" targetCategory="0">
<NumericPredictor name="Attribute0" coefficient="0.1230185712966992"/>
<NumericPredictor name="Attribute1" coefficient="0.03514316177407176"/>
<NumericPredictor name="Attribute2" coefficient="-0.013282878621280676"/>
<NumericPredictor name="Attribute3" coefficient="6.631624570875322E-4"/>
<NumericPredictor name="Attribute4" coefficient="-0.0011962985482762522"/>
<NumericPredictor name="Attribute5" coefficient="0.08961636497438935"/>
<NumericPredictor name="Attribute6" coefficient="0.943894934066085"/>
<NumericPredictor name="Attribute7" coefficient="0.014842809237409734"/>
</RegressionTable>
</RegressionModel>
</PMML>
This is .xml file for LinearRegression:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_2">
<DataDictionary numberOfFields="9">
<DataField name="Attribute0" optype="continuous" dataType="double"/>
<DataField name="Attribute1" optype="continuous" dataType="double"/>
<DataField name="Attribute2" optype="continuous" dataType="double"/>
<DataField name="Attribute3" optype="continuous" dataType="double"/>
<DataField name="Attribute4" optype="continuous" dataType="double"/>
<DataField name="Attribute5" optype="continuous" dataType="double"/>
<DataField name="Attribute6" optype="continuous" dataType="double"/>
<DataField name="Attribute7" optype="continuous" dataType="double"/>
<DataField name="Attribute8" optype="continuous" dataType="double"/>
</DataDictionary>
<RegressionModel functionName="regression" algorithmName="LinearRegression" normalizationMethod="logit">
<MiningSchema>
<MiningField name="Attribute0"/>
<MiningField name="Attribute1"/>
<MiningField name="Attribute2"/>
<MiningField name="Attribute3"/>
<MiningField name="Attribute4"/>
<MiningField name="Attribute5"/>
<MiningField name="Attribute6"/>
<MiningField name="Attribute7"/>
<MiningField name="Attribute8" usageType="target"/>
</MiningSchema>
<RegressionTable intercept="-8.397856251858588" targetCategory="0">
<NumericPredictor name="Attribute0" coefficient="0.1230185712966992"/>
<NumericPredictor name="Attribute1" coefficient="0.03514316177407176"/>
<NumericPredictor name="Attribute2" coefficient="-0.013282878621280676"/>
<NumericPredictor name="Attribute3" coefficient="6.631624570875322E-4"/>
<NumericPredictor name="Attribute4" coefficient="-0.0011962985482762522"/>
<NumericPredictor name="Attribute5" coefficient="0.08961636497438935"/>
<NumericPredictor name="Attribute6" coefficient="0.943894934066085"/>
<NumericPredictor name="Attribute7" coefficient="0.014842809237409734"/>
</RegressionTable>
</RegressionModel>
</PMML>
Here is the stack trace
org.jpmml.evaluator.InvalidFeatureException: DataField
at org.jpmml.evaluator.RegressionModelEvaluator.evaluateClassification(RegressionModelEvaluator.java:119)
at org.jpmml.evaluator.RegressionModelEvaluator.evaluate(RegressionModelEvaluator.java:69)
at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:406)
at com.norkorm.blake.pmml.LogisticRegressionPMMLTest.makePredictions(LogisticRegressionPMMLTest.java:250)
at com.norkorm.blake.pmml.LogisticRegressionPMMLTest.testLogisticPMML(LogisticRegressionPMMLTest.java:217)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:176)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:131)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
An exception is thrown when you pass a BigDecimal into the ModelEvaluator method prepare(activeField, rawValue). This is because TypeUtil.getDataType(Object value) does not check for BigDecimal values, so an EvaluationException is thrown.
BigDecimal values are recognized as superior to Doubles/Floats for financial calculations and considered 'best practice'. It is recommended that the JPMML framework handles them without having to convert to a Double first.
Adding the following to the getDataType(Object value) method should resolve this issue:
if(value instanceof BigDecimal){
return DataType.DOUBLE;
} else
In addition: improving the message provided within the EvaluationException would also help with diagnosis of future issues. For example, the message 'the class java.math.BigDecimal is not a supported type' would improve the usability of the framework.
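Until something like that is in the library, a caller-side workaround is to convert BigDecimal values before handing them to prepare(activeField, rawValue). A minimal sketch follows, assuming that a Double is an accepted raw value and that losing BigDecimal precision at this point is acceptable; RawValues, normalize and normalizeAll are hypothetical names, not JPMML API.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical caller-side helper; not part of JPMML. */
final class RawValues {

    /** Converts BigDecimal to Double and leaves every other value untouched. */
    static Object normalize(Object rawValue) {
        if (rawValue instanceof BigDecimal) {
            // doubleValue() may lose precision; that mirrors mapping BigDecimal to DataType.DOUBLE.
            return ((BigDecimal) rawValue).doubleValue();
        }
        return rawValue;
    }

    /** Applies normalize to every value of a row, preserving the field order. */
    static Map<String, Object> normalizeAll(Map<String, ?> row) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : row.entrySet()) {
            result.put(entry.getKey(), normalize(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("Amount", new BigDecimal("104.10"));
        row.put("Currency", "EUR");
        System.out.println(normalizeAll(row)); // prints "{Amount=104.1, Currency=EUR}"
    }
}
```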
Hi,
I am new to executing clustering PMML models with the JPMML evaluator.
I get an exception when I run the line below.
java -cp target/example-1.2-SNAPSHOT.jar org.jpmml.evaluator.EvaluationExample --model D:\analytics\Test\pmml\AuditKMeans.pmml --input D:\analytics\Test\csv\AuditData_Test.csv--output D:\analytics\Test\output\Audit_KmeansRes.csv
Exception in thread "main" java.lang.IllegalArgumentException: Missing active field(s): [Age, Income, Deductions, Hours]
at org.jpmml.evaluator.EvaluationExample.execute(EvaluationExample.java:217)
at org.jpmml.evaluator.Example.execute(Example.java:60)
at org.jpmml.evaluator.EvaluationExample.main(EvaluationExample.java:127)
I have attached input and model files.
Sample.zip
Please help me resolve the above exception.
According to http://www.dmg.org/v4-2-1/RuleSet.html#RuleSet it should be possible to define a default score for a rule set which is returned when none of the rules fire. However, OpenScoring returns a server error in this scenario.
My pmml model:
<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
<DataDictionary numberOfFields="1">
<DataField name="$Result" displayName="$Result" optype="categorical" dataType="string"/>
</DataDictionary>
<RuleSetModel modelName="Trivial" functionName="classification" algorithmName="RuleSet">
<MiningSchema>
<MiningField name="$Result" usageType="target"/>
</MiningSchema>
<LocalTransformations>
<DerivedField name="foobar" displayName="foobar" optype="categorical" dataType="boolean">
<Constant>true</Constant>
</DerivedField>
</LocalTransformations>
<RuleSet defaultScore="True" defaultConfidence="0.0">
<RuleSelectionMethod criterion="firstHit"/>
<SimpleRule id="RULE1" score="Something">
<SimplePredicate field="foobar" operator="equal" value="false"/>
</SimpleRule>
</RuleSet>
</RuleSetModel>
</PMML>
JSON request:
{
"id": "example-001",
"arguments": {}
}
The result:
$ curl -X POST --data-binary @trivial-example-request.json -H "Content-type: application/json" http://localhost:8080/openscoring/model/trivial
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 </title>
</head>
<body>
<h2>HTTP ERROR: 500</h2>
<p>Problem accessing /openscoring/model/trivial. Reason:
<pre> Internal Server Error</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
</body>
</html>
When I run the following code
Object rawValue = 1.0;
FieldValue activeValue = input.prepare(rawValue);
this error always occurs:
Exception in thread "main" org.jpmml.evaluator.InvalidResultException
at org.jpmml.evaluator.FieldValueUtil.performInvalidValueTreatment(FieldValueUtil.java:190)
at org.jpmml.evaluator.FieldValueUtil.prepareInputValue(FieldValueUtil.java:94)
at org.jpmml.evaluator.InputField.prepare(InputField.java:64)
at cn.pmml.test1.PMMLTest.arguments(PMMLTest.java:87)
at cn.pmml.test1.PMMLTest.main(PMMLTest.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
I tried many ways to fix that, but all failed.
Hi,
Executing an SVM PMML model with the JPMML evaluator works fine for the Audit data set, but for my own data I get wrong results. The data set has 4 fields; one of them is the target field, which contains three categories.
I used the line below for execution in my console.
java -cp target/example-1.2-SNAPSHOT.jar org.jpmml.evaluator.EvaluationExample --model model.pmml --input input.tsv --output output.tsv
Please help me with the above query.
Thanks in advance.
Hello,
In this commit (a309d50), "Restricted the visibility of EvaluationContext constructors", you removed the "public" access modifier, so I can no longer instantiate "PMMLEvaluationContext" with new.
When running the following code
ModelEvaluator<RegressionModel> modelEvaluator = new RegressionModelEvaluator(model);
Evaluator evaluator = (Evaluator) modelEvaluator;
I get an error like this:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.from(Lcom/google/common/cache/CacheBuilderSpec;)Lcom/google/common/cache/CacheBuilder;
My .pmml file contains ~900 input fields of type double.
My application runs in a multi-threaded environment and evaluates with 30 threads.
org.jpmml.evaluator.TypeUtil, line 208, does return (Double.parseDouble(value) + 0d); according to the reference below, that call goes through a synchronized method, which blocks the other 29 threads and hurts overall performance.
Ref: http://dalelane.co.uk/blog/?p=2936
As a workaround I added this class from
https://gist.github.com/dalelane/7720269
and changed line 208 to call
return (DoubleParser.parseDouble(value) + 0d);
which solved the issue.
I suggest you do the same if required.
Hi,
I got a very terse message
Apr 20, 2016 5:21:58 PM org.openscoring.client.CsvEvaluator run
WARNUNG: CSV evaluation failed: Mark invalid
from this evaluation:
"java -cp $jpmml/target/client-executable-1.2-SNAPSHOT.jar org.openscoring.client.CsvEvaluator --model http://localhost:8080/openscoring/model/460012_p_aktiv --input ~/test.csv --output ~/test_output.csv
any Idea???
has it to do with my missing (jpmml- xgboost)
mpg.dmatrix = genDMatrix(mpg_y, mpg_X, "xgboost.svm")
part? I realised that I don't need xgboost.svm in order to get the PMML file.
I simply used
xgboost(param=param,
data = data.matrix(training[,feature.names]),
label=training$aktiv_target,
nrounds=trounds_tmp,
base_score = base,
missing=NA
)
So I relied on the implicit transformation of the data by xgboost.
I get different results from predict in R than from the published PMML evaluated via jpmml-xgboost and openscoring.
Would a sample data set and the R code be of interest?
Hello Vilu,
I've trained an ensemble.GradientBoostingClassifier classifier and deployed it to openscoring, but I keep getting a 400 response to my requests.
Using the same pipeline to generate the PMML (using sklearn2pmml) and sending the same input works well with simpler models (like linear_model.LogisticRegression()).
Is GradientBoostingClassifier supported by the sklearn2pmml but not by openscoring?
Thanks!
Hi,
I am new to this. When I execute the command below, an exception occurs.
The exception is raised for an R-PMML tree model.
I used the line below for execution:
D:\JPMML\jpmml-evaluator-master\pmml-evaluator-example>java -cp target/example-1.2-SNAPSHOT.jar org.jpmml.evaluator.EvaluationExample --model D:\JPMML\Test\pmml\IrisTree.pmml --input D:\JPMML\Test\csv\Iris.csv --output D:\JPMML\Test\output\TreeOutput.csv
Exception in thread "main" org.jpmml.evaluator.DuplicateValueException: class
at org.jpmml.evaluator.EvaluationContext.declare(EvaluationContext.java:91)
at org.jpmml.evaluator.OutputUtil.evaluate(OutputUtil.java:330)
at org.jpmml.evaluator.TreeModelEvaluator.evaluate(TreeModelEvaluator.java:93)
at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:406)
at org.jpmml.evaluator.EvaluationExample.execute(EvaluationExample.java:261)
at org.jpmml.evaluator.Example.execute(Example.java:60)
at org.jpmml.evaluator.EvaluationExample.main(EvaluationExample.java:127)
Hi,
following code (using 1.2.5 release):
final Map<FieldName, ?> results = kMeansModel.evaluate(params);
for (final Entry<FieldName, ?> resultEntry : results.entrySet())
{
System.out.printf("%s = %s%n", resultEntry.getKey(), resultEntry.getValue());
}
returns this:
null = ClusterAffinityDistribution{result=5, distance_entries=[1=46.498128117308376, 2=47.12002804402491, 3=49.17335819210169, 4=43.117652229258695, 5=39.95722874558617, 6=45.533022467040844, 7=46.711182656888525], entityId=5}
predictedValue = 5
clusterAffinity_1 = 39.95722874558617
clusterAffinity_2 = 39.95722874558617
clusterAffinity_3 = 39.95722874558617
clusterAffinity_4 = 39.95722874558617
clusterAffinity_5 = 39.95722874558617
clusterAffinity_6 = 39.95722874558617
clusterAffinity_7 = 39.95722874558617
Shouldn't the clusterAffinity_? fields have the same values as in the first line?
Regards,
Juraj.
Hello,
could you please add (and maintain) a changelog?
Cheers,
Thomas
Sorry to trouble you again~
JPMML works well when I use LogisticRegression, but fails with other models such as random forest.
The model comes from sklearn, and I use your awesome tool sklearn2pmml.
The error is:
Exception in thread "main" org.jpmml.evaluator.EvaluationException
at org.jpmml.evaluator.CategoricalValue.compareToString(CategoricalValue.java:39)
at org.jpmml.evaluator.FieldValue.compareTo(FieldValue.java:139)
at org.jpmml.evaluator.PredicateUtil.evaluateSimplePredicate(PredicateUtil.java:131)
at org.jpmml.evaluator.PredicateUtil.evaluatePredicate(PredicateUtil.java:63)
at org.jpmml.evaluator.PredicateUtil.evaluate(PredicateUtil.java:51)
at org.jpmml.evaluator.tree.TreeModelEvaluator.evaluateNode(TreeModelEvaluator.java:201)
at org.jpmml.evaluator.tree.TreeModelEvaluator.handleTrue(TreeModelEvaluator.java:218)
at org.jpmml.evaluator.tree.TreeModelEvaluator.evaluateTree(TreeModelEvaluator.java:162)
at org.jpmml.evaluator.tree.TreeModelEvaluator.evaluateClassification(TreeModelEvaluator.java:137)
at org.jpmml.evaluator.tree.TreeModelEvaluator.evaluate(TreeModelEvaluator.java:106)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateSegmentation(MiningModelEvaluator.java:407)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateClassification(MiningModelEvaluator.java:240)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:207)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:185)
at com.ctrip.hotelbi.jpmml.Score.gettingProbability(Score.java:32)
at com.ctrip.hotelbi.jpmml.Score.gettingProbability(Score.java:53)
at com.ctrip.hotelbi.jpmml.PMMLTest.main(PMMLTest.java:41)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
The code is:
public class Process {
private String[] data;
private Evaluator evaluator;
public Process() {
}
public Process(String[] data, Evaluator evaluator) {
this.data = data;
this.evaluator = evaluator;
}
public Map<FieldName, FieldValue> testData() {
/**
* Prepare test data
* @return input data for prediction
*/
Map<FieldName, FieldValue> arguments = new LinkedHashMap<>();
List<InputField> inputs = this.evaluator.getActiveFields();
for (InputField input : inputs) {
FieldName activeName = input.getName();
int i = inputs.indexOf(input);
FieldValue activeValue = null;
try {
if (input.getDataType().equals(DataType.DOUBLE)) {
activeValue = input.prepare(Double.parseDouble(this.data[i]));
}else activeValue = FieldValueUtil.create( this.data[i] );
}catch (Exception e){
activeValue = FieldValueUtil.create(0.0);
e.printStackTrace();
}
arguments.put(activeName, activeValue);
}
return arguments;
}
}
public class Score extends Process{
private String[] data;
private Evaluator evaluator;
public Score(String[] data, Evaluator evaluator) {
super(data, evaluator);
}
public ArrayList<?> gettingProbability(Evaluator evaluator){
/**
Predict all target label probabilities
@param evaluator pmml model
@return probability score of each label
*/
Map<FieldName, FieldValue> testData = super.testData();
ArrayList<Object> score = new ArrayList();
System.out.println(testData.size());
Map<FieldName,?> finalResults = evaluator.evaluate(testData);
for(FieldName t : finalResults.keySet()){
if (finalResults.get(t) instanceof Double) {
score.add((Double) finalResults.get(t));
}else{
score.add(finalResults.get(t));
}
}
return score;
}
public Double gettingProbability(Evaluator evaluator,int targetLabelIndex){
/**
Predict target label probability
@param evaluator pmml model
@param targetLabelIndex the index of target label that you want to predict
@return probability score of each label
*/
ArrayList<?> scoreArray = this.gettingProbability(evaluator);
Double targetScore = (Double) scoreArray.get(targetLabelIndex);
return targetScore;
}
}
public class PMMLTest {
public static void main(String[] args) throws IOException, JAXBException, SAXException {
//Loading data
CSVReader reader = new CSVReader(new FileReader("d:\\Users\\shuangyangwang\\Desktop\\JPMML\\Iris1.csv"));
List<String[]> data = reader.readAll();
data.remove(0);
reader.close();
//Loading model
InputStream is = new FileInputStream("d:\\Users\\shuangyangwang\\Desktop\\Test\\ExtraTreesClassifier.pmml");
PMML model = PMMLUtil.unmarshal(is);
is.close();
ModelEvaluatorFactory mef = ModelEvaluatorFactory.newInstance();
ModelEvaluator<?> modelEvaluator = mef.newModelEvaluator(model);
Evaluator evaluator = (Evaluator) modelEvaluator;
evaluator.verify();
//Predicting probability
List<ArrayList<?>> listArray = new ArrayList<>();
for (String[] s : data) {
// PreprocessData ppd = new PreprocessData(s, evaluator);
// Map<FieldName, FieldValue> testData = ppd.testData();
Score scoreE = new Score(s, evaluator);
//ArrayList<Double> result = (ArrayList<Double>) scoreE.gettingProbability(evaluator);
Double score = scoreE.gettingProbability( evaluator ,1);
System.out.println(score);
//listArray.add(result);
}
}
}
I really don't know what is wrong with that, please give me some suggestions
Thank you very much
While using the JPMML evaluator to make predictions with a PMML file created by sklearn2pmml, I get this error:
org.jpmml.evaluator.TypeCheckException: Expected FLOAT, but got DOUBLE (3.4)
at org.jpmml.evaluator.TypeUtil.toFloat(TypeUtil.java:419)
at org.jpmml.evaluator.TypeUtil.cast(TypeUtil.java:333)
I am using version 1.3.5 of the evaluator. Please find below the mapper used when creating the PMML file; no transformation was specified.
iris_pipeline = PMMLPipeline([
("mapper", DataFrameMapper([
(["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"], [ContinuousDomain(), Imputer()])
])),
("classifier", RandomForestClassifier(n_estimators = 100))
])
16/12/13 14:58:12 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, byd0158): org.jpmml.evaluator.InvalidFeatureException (at or around line 5759): Target
at org.jpmml.evaluator.IndexableUtil.ensureKey(IndexableUtil.java:81)
at org.jpmml.evaluator.IndexableUtil.buildMap(IndexableUtil.java:64)
at org.jpmml.evaluator.ModelEvaluator$7.load(ModelEvaluator.java:586)
at org.jpmml.evaluator.ModelEvaluator$7.load(ModelEvaluator.java:582)
at com.shaded.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
at com.shaded.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
at com.shaded.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
at com.shaded.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
at com.shaded.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.shaded.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957)
at com.shaded.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875)
at org.jpmml.evaluator.CacheUtil.getValue(CacheUtil.java:50)
at org.jpmml.evaluator.ModelEvaluator.<init>(ModelEvaluator.java:139)
at org.jpmml.evaluator.MiningModelEvaluator.<init>(MiningModelEvaluator.java:79)
at org.jpmml.evaluator.ModelEvaluatorFactory.newModelManager(ModelEvaluatorFactory.java:66)
at org.jpmml.evaluator.MiningModelEvaluator.createSegmentHandler(MiningModelEvaluator.java:559)
at org.jpmml.evaluator.MiningModelEvaluator.evaluateSegmentation(MiningModelEvaluator.java:355)
at org.jpmml.evaluator.MiningModelEvaluator.evaluateClassification(MiningModelEvaluator.java:223)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:190)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:167)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:162)
at org.jpmml.spark.PMMLTransformer$2.apply(PMMLTransformer.java:128)
at org.jpmml.spark.PMMLTransformer$2.apply(PMMLTransformer.java:113)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.evalExpr2$(Unknown Source)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51)
at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1858)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Hi,
I am new to PMML execution using the JPMML Evaluator.
When I tried to execute a clustering PMML model (KNIME) for the Iris data from the DMG site, I got the following exception.
Exception in thread "main" java.lang.NullPointerException
at org.jpmml.evaluator.BatchUtil.formatRecords(BatchUtil.java:190)
at org.jpmml.evaluator.EvaluationExample.execute(EvaluationExample.java:295)
at org.jpmml.evaluator.Example.execute(Example.java:60)
at org.jpmml.evaluator.EvaluationExample.main(EvaluationExample.java:127)
I used the line below to execute the PMML from my local command prompt.
java -cp target/example-1.2-SNAPSHOT.jar org.jpmml.evaluator.EvaluationExample --model model.pmml --input input.tsv --output output.tsv
Please help me.
This issue is based on the following JPMML mailing list thread: https://groups.google.com/forum/#!topic/jpmml/1IsR9zTm4KY
Technically, it is possible to detect if the model contains a residual-type output field, and if so, add an extra value to the argument data record:
List<OutputField> outputFields = evaluator.getOutputFields();
for(OutputField outputField : outputFields){
if((ResultFeature.RESIDUAL).equals(outputField.getResultFeature())){
TargetField targetField = Iterables.getOnlyElement(evaluator.getTargetFields()); // Get the sole target field
arguments.put(targetField.getName(), userArguments.get(targetField.getName()));
}
}
However, this assumes great familiarity with the PMML specification and the JPMML-Evaluator way of doing things, which is an unreasonable expectation (also, the above code might not work if the residual value is calculated at some deeper model nesting level).
Hello,
My issue pertains to GeneralRegressionModelEvaluator.java and specifically to Generalized Linear Model.
If we take into consideration a two-class (here, +1 and -1) classification problem, then the generalized linear model would estimate Pr(class = class1) and Pr(class = class2) for any data point given its feature vector. This is done by modeling the distributions with a logit function. Since we're estimating a pmf, we will have Pr(class = class1) + Pr(class = class2) = 1.
If we look at the loop starting at line 337, it is basically supposed to do the same thing: iterate over the different classes/categories and compute the probability of each. Everything goes well for class1, but when the code does the computation for class2 (which is the last category), it assigns value = 0 in line 417 and passes that through the logit function. This will always give the last category a probability of 0.5, no matter how many categories there are.
For a two-category problem, say the probability we compute for category 1 in the first iteration of the for loop starting at line 337 is value1; then the probability of the other class should simply be (1 - value1). The code does not achieve this. In fact, it always assigns the last category a probability of 0.5.
If I'm right, a quick fix could be that for the last category, the probability should be just 1 - sum(all the other probabilities).
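The proposed fix can be sketched for the two-category case as follows. This is only a minimal illustration of the arithmetic described above, not the code in GeneralRegressionModelEvaluator.java; the class and method names are made up for the example, and the multi-category case is not covered.

```java
// Hypothetical sketch: names are illustrative, not part of JPMML-Evaluator.
final class BinaryLogitSketch {

    // Inverse logit link: maps a linear predictor to a probability in (0, 1).
    static double inverseLogit(double value) {
        return 1d / (1d + Math.exp(-value));
    }

    // Returns {Pr(class1), Pr(class2)} for a two-category problem.
    // The last category is derived as the remainder instead of inverseLogit(0).
    static double[] probabilities(double value1) {
        double p1 = inverseLogit(value1);
        return new double[] {p1, 1d - p1};
    }

    public static void main(String[] args) {
        double[] p = probabilities(2.0);
        System.out.println("Pr(class1) = " + p[0] + ", Pr(class2) = " + p[1]);
    }
}
```

With this approach the two probabilities sum to 1 by construction, and the last category only gets 0.5 when the first one does.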
Thanks
Akshay
I am currently attempting to evaluate a .pmml model created with sklearn2pmml. However, whenever I attempt to run the code
ModelEvaluator<NearestNeighborModel> modelEvaluator = new NearestNeighborModelEvaluator(pmml);
I get the following error:
java.lang.NullPointerException: Attempt to invoke virtual method 'boolean org.dmg.pmml.PMML.hasModels()' on a null object reference
at org.jpmml.evaluator.ModelEvaluator.selectModel(ModelEvaluator.java:584)
at org.jpmml.evaluator.nearest_neighbor.NearestNeighborModelEvaluator.<init>(NearestNeighborModelEvaluator.java:105)
at com.mygdx.game.DrawView.pitchAngle(DrawView.java:295)
at com.mygdx.game.StartGdxGame.render(StartGdxGame.java:113)
at com.badlogic.gdx.backends.android.AndroidGraphics.onDrawFrame(AndroidGraphics.java:459)
at android.opengl.GLSurfaceView$GLThread.guardedRun(GLSurfaceView.java:1522)
at android.opengl.GLSurfaceView$GLThread.run(GLSurfaceView.java:1239)
I have double-checked that the code can find and access the .pmml file, and a model does exist in the .pmml file in the form <NearestNeighborModel functionName="regression" numberOfNeighbors="400" continuousScoringMethod="average">.
Is there any other reason for the error? Did I maybe compile the .pmml incorrectly?
I have the following R code for generating 2 csv and 2 pmml files based on the iris dataset:
data(iris)
library(pmml)
# build a model for Sepal.Length based on remaining variables
model.glm <- glm(Sepal.Length ~ ., data=iris)
saveXML(pmml(model.glm), "iris.glm.pmml")
# write csv file for testing
write.csv(iris, 'iris.csv', quote=FALSE, row.names=FALSE)
# set remaining variables to booleans
iris$Sepal.Width <- as.logical(iris$Sepal.Width > 3)
iris$Petal.Length <- as.logical(iris$Petal.Length > 4)
iris$Petal.Width <- as.logical(iris$Petal.Width > 1)
iris$Species <- as.logical(iris$Species=='setosa')
# rebuild model for Sepal.Length
model.glm <- glm(Sepal.Length ~ ., data=iris)
saveXML(pmml(model.glm), "iris.glm.bool.pmml")
# write csv file for testing
write.csv(iris, 'iris.bool.csv', quote=FALSE, row.names=FALSE)
The problem becomes apparent when doing predictions. The files iris.csv and iris.glm.pmml produce the desired output. The files iris.bool.csv and iris.glm.bool.pmml produce the same value for every record, regardless of the input data.
Following your advice at the above URL, we have generated the PMML file, but the output does not match the desired output.
PMML snippet:
Output file: we are getting the output (Predicted_Cluster) as 1->1, 2->3 and 3->2.
Please advise on the above.
I realise this might sound like a stupid question, but I am not used to Java or mvn. I have already spent more than an hour trying to install the evaluator. I first tried mvn get with the central repository, then git clone and mvn build; neither has got me anywhere. Please advise; any help is highly appreciated.
First approach:
mvn org.apache.maven.plugins:maven-dependency-plugin:2.8:get -Dartifact=org.jpmml:pmml-evaluator:1.2.5:jar -DoutputDirectory=.
I've tried a lot of variants of this command, searching around and looking over tutorials, but I am still not sure where to go from here.
When I try
java -jar target/pmml-evaluator-1.2.5-sources.jar
it tells me about a missing manifest file. I've tried including this in the pom file provided in the central repository, with the option -DpomFile=pom.xml, but it complains about the execution ids.
Second approach:
git clone https://github.com/jpmml/jpmml-evaluator
cd jpmml-evaluator/
mvn build pom.xml
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.jpmml:pmml-evaluator:jar:1.2-SNAPSHOT
[WARNING] 'parent.relativePath' of POM org.jpmml:jpmml-evaluator:1.2-SNAPSHOT (/Users/<>/target/jpmml-evaluator/pom.xml) points at org.jpmml:pmml-evaluator instead of org.sonatype.oss:oss-parent, please verify your project structure @ org.jpmml:jpmml-evaluator:1.2-SNAPSHOT, /Users/benjamin/target/jpmml-evaluator/pom.xml, line 5, column 10
...
[much more of this]
...
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] JPMML-Evaluator
[INFO] JPMML evaluator
[INFO] JPMML evaluator example
[INFO] JPMML KNIME integration tests
[INFO] JPMML RapidMiner integration tests
[INFO] JPMML R/Rattle integration tests
[INFO] JPMML evaluator code coverage
[INFO] JPMML extension
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building JPMML-Evaluator 1.2-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] JPMML-Evaluator .................................... FAILURE [ 0.388 s]
[INFO] JPMML evaluator .................................... SKIPPED
[INFO] JPMML evaluator example ............................ SKIPPED
[INFO] JPMML KNIME integration tests ...................... SKIPPED
[INFO] JPMML RapidMiner integration tests ................. SKIPPED
[INFO] JPMML R/Rattle integration tests ................... SKIPPED
[INFO] JPMML evaluator code coverage ...................... SKIPPED
[INFO] JPMML extension .................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.158 s
[INFO] Finished at: 2015-10-05T16:24:08+01:00
[INFO] Final Memory: 5M/65M
[INFO] ------------------------------------------------------------------------
[ERROR] Unknown lifecycle phase "pom.xml". You must specify a valid lifecycle phase or a goal in the format <plugin-prefix>:<goal> or <plugin-group-id>:<plugin-artifact-id>[:<plugin-version>]:<goal>. Available lifecycle phases are: validate, initialize, generate-sources, process-sources, generate-resources, process-resources, compile, process-classes, generate-test-sources, process-test-sources, generate-test-resources, process-test-resources, test-compile, process-test-classes, test, prepare-package, package, pre-integration-test, integration-test, post-integration-test, verify, install, deploy, pre-clean, clean, post-clean, pre-site, site, post-site, site-deploy. -> [Help 1]
...
I have 132 NeuralInputs in my PMML, but the evaluator.getActiveFields() method keeps returning 100.
Is there something missing in my PMML?
Attached is my PMML for your reference.
Thanks,
With ~900 input fields of type double in my model, most of the threads waste time (28% of the execution time) in the method org.jpmml.evaluator.ModelEvaluator.getInputFields(ModelEvaluator.java:207), which is called on every execution in every thread.
The same method is called while creating the arguments in each thread. I built a shared list of input fields there, which solved that part, but evaluate calls the method again, and that still affects performance.
Could we pass the input fields to the evaluate method along with the arguments? This could save 28% of the execution time, although it would require changing the arguments everywhere.
Thread Dump:
at org.jpmml.evaluator.ModelEvaluator.createInputFields(ModelEvaluator.java:397)
at org.jpmml.evaluator.ModelEvaluator.getInputFields(ModelEvaluator.java:207)
at org.jpmml.evaluator.mining.MiningModelEvaluator.createSegmentHandler(MiningModelEvaluator.java:600)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateSegmentation(MiningModelEvaluator.java:367)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluateClassification(MiningModelEvaluator.java:240)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:207)
at org.jpmml.evaluator.mining.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:185)
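The workaround mentioned above (building the input field list once and sharing it between threads) can be sketched in plain Java as a thread-safe memoizing holder. This is a generic illustration under the assumption that the list does not change after it is first built; the Memoized class is hypothetical and is not part of JPMML-Evaluator.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: computes a value once and shares it across threads.
final class Memoized<T> implements Supplier<T> {

    private final Supplier<T> delegate;
    private volatile T value;

    Memoized(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public T get() {
        T result = value;
        if (result == null) {
            // Double-checked locking: only the first caller pays for the computation.
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = delegate.get();
                    value = result;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for an expensive call such as building ~900 input fields.
        Memoized<List<String>> inputFields = new Memoized<>(() -> Arrays.asList("x1", "x2", "x3"));
        System.out.println(inputFields.get().size());
    }
}
```

After the first call, get() is a single volatile read, so worker threads no longer repeat the expensive construction.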
Hi,
When I try to use the evaluator from a spark context, it will not create the model manager because of pmml validation problems.
Exception in thread "main" org.jpmml.evaluator.InvalidFeatureException (at or around line 8): DataDictionary [13/1951]
at org.jpmml.evaluator.CacheUtil.getValue(CacheUtil.java:58)
at org.jpmml.evaluator.ModelEvaluator.<init>(ModelEvaluator.java:113)
at org.jpmml.evaluator.TreeModelEvaluator.<init>(TreeModelEvaluator.java:54)
at org.jpmml.evaluator.ModelEvaluatorFactory.newModelManager(ModelEvaluatorFactory.java:101)
at org.jpmml.evaluator.ModelEvaluatorFactory.newModelManager(ModelEvaluatorFactory.java:45)
at org.jpmml.evaluator.ModelManagerFactory.newModelManager(ModelManagerFactory.java:66)
at org.jpmml.evaluator.ModelManagerFactory.newModelManager(ModelManagerFactory.java:46)
at com.example.Main.main(Main.java:23)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassCastException: org.dmg.pmml.DataField cannot be cast to org.dmg.pmml.Indexable
at org.jpmml.evaluator.IndexableUtil.ensureKey(IndexableUtil.java:78)
at org.jpmml.evaluator.IndexableUtil.buildMap(IndexableUtil.java:64)
at org.jpmml.evaluator.ModelEvaluator$1.load(ModelEvaluator.java:538)
at org.jpmml.evaluator.ModelEvaluator$1.load(ModelEvaluator.java:534)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875)
at org.jpmml.evaluator.CacheUtil.getValue(CacheUtil.java:50)
... 16 more
Here is my java class I am submitting:
package com.example;
import org.dmg.pmml.PMML;
import org.jpmml.evaluator.Evaluator;
import org.jpmml.evaluator.ModelEvaluatorFactory;
import org.jpmml.model.ImportFilter;
import org.jpmml.model.JAXBUtil;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import javax.xml.bind.JAXBException;
import javax.xml.transform.Source;
public final class Main {
public static void main(String[] args) {
System.out.println("hello world");
try {
Source transformedSource = ImportFilter.apply(new InputSource(Main.class.getResourceAsStream("/DecisionTreeIris.pmml")));
PMML pmml = JAXBUtil.unmarshalPMML(transformedSource);
ModelEvaluatorFactory modelEvaluatorFactory = ModelEvaluatorFactory.newInstance();
Evaluator evaluator = modelEvaluatorFactory.newModelManager(pmml);
evaluator.verify();
} catch (SAXException | JAXBException e) {
// could not parse pmml as xml
throw new RuntimeException(e);
}
}
}
Submitting with spark-submit --class com.example.Main /path/to/example-assembly.jar.
It does not throw the error when I run the assembled jar like java -jar /path/to/example-assembly.jar.
DecisionTreeIris.pmml is from here.
Thanks for the project. Any help is appreciated.
We have an issue in production where we have a DefineFunction in the PMML.
Looking at FunctionUtil.evaluate, it tries to find a user-defined function via reflection before trying to use the one from the PMML. The problem is that reflection is a bit too slow for us in production. It would be great either to have some way to supply our own FunctionRegistry or to change the order in which functions are resolved in FunctionUtil.
Hi @vruusmann,
Looking through the implementation of multipleModelMethod="max" for classification, particularly: https://github.com/jpmml/jpmml-evaluator/blob/master/pmml-evaluator/src/main/java/org/jpmml/evaluator/ProbabilityAggregator.java#L207
Suppose we have a case with three segments that are predicting three classes and we have the following probabilities:
{a: 0.8, b: 0.1, c: 0.1},
{a: 0.5, b: 0.1, c: 0.4},
{a: 0.1, b: 0.1, c: 0.8}
Then using the max I would expect the average of the first and third model:
{a: 0.45, b: 0.1, c: 0.45}
max: consider the model(s) that have contributed the chosen probability for the winning category. Return their average probabilities;
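The expectation above can be written out as a small sketch. It follows the reading of the "max" rule given in this post (find the highest probability, keep the segments that contributed it, average those segments); it is not the ProbabilityAggregator code, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the "max" rule as read in this post.
final class MaxAggregationSketch {

    // Convenience builder for a three-class probability distribution.
    static Map<String, Double> distribution(double a, double b, double c) {
        Map<String, Double> result = new LinkedHashMap<>();
        result.put("a", a);
        result.put("b", b);
        result.put("c", c);
        return result;
    }

    static Map<String, Double> aggregateMax(List<Map<String, Double>> segments) {
        // Step 1: find the highest probability over all segments and categories.
        double max = Double.NEGATIVE_INFINITY;
        for (Map<String, Double> segment : segments) {
            for (double probability : segment.values()) {
                max = Math.max(max, probability);
            }
        }
        // Step 2: keep the segments that contributed that probability.
        List<Map<String, Double>> winners = new ArrayList<>();
        for (Map<String, Double> segment : segments) {
            if (segment.containsValue(max)) {
                winners.add(segment);
            }
        }
        // Step 3: average the probabilities of the winning segments.
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map<String, Double> winner : winners) {
            for (Map.Entry<String, Double> entry : winner.entrySet()) {
                result.merge(entry.getKey(), entry.getValue() / winners.size(), Double::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> segments = new ArrayList<>();
        segments.add(distribution(0.8, 0.1, 0.1));
        segments.add(distribution(0.5, 0.1, 0.4));
        segments.add(distribution(0.1, 0.1, 0.8));
        System.out.println(aggregateMax(segments));
    }
}
```

For the three segments listed above, the first and third both contribute the top probability of 0.8, so only those two are averaged.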
Looking at the following from http://dmg.org/pmml/v4-3/TreeModel.html#xsdType_MISSING-VALUE-STRATEGY
missingValuePenalty:
This optional attribute of TreeModel allows computed confidences to be reduced by a specified factor each time certain kinds of missing value handling are invoked during the scoring of a case. For each Node where either surrogate rules or the defaultChild strategy had to be used to select a child, the final confidences are multiplied by this factor. Note that this is based on the number of Nodes, not on the overall number of missing values that were encountered (with operator surrogate, multiple missing values can be encountered within a single Node). For example, if two Nodes with missing values were encountered to get to the final prediction, confidence is multiplied by the two missingValuePenalty values.
It sounds like the value of missingLevels in https://github.com/jpmml/jpmml-evaluator/blob/master/pmml-evaluator/src/main/java/org/jpmml/evaluator/tree/TreeModelEvaluator.java should be the number of nodes that evaluate to Unknown, and nodes that rescue a missing value using a surrogate should not count.
That seems to be contrary to the logic here: https://github.com/jpmml/jpmml-evaluator/blob/master/pmml-evaluator/src/main/java/org/jpmml/evaluator/tree/TreeModelEvaluator.java#L193-L195
Am I reading the code wrong, or misinterpreting PMML?
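To make the two readings concrete, here is a minimal sketch of the penalty arithmetic: the confidence is multiplied by missingValuePenalty once per penalized node on the path. Which kinds of child selection count as "penalized" is exactly the open question, so it is left as a parameter. The enum and method names are hypothetical and do not come from TreeModelEvaluator.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not part of JPMML-Evaluator.
final class MissingValuePenaltySketch {

    // How the child of a node on the scoring path was selected.
    enum ChildSelection { PREDICATE, SURROGATE, DEFAULT_CHILD }

    // Multiplies the confidence by the penalty once per penalized node.
    static double penalize(double confidence, double missingValuePenalty,
                           List<ChildSelection> path, Set<ChildSelection> penalized) {
        int missingLevels = 0;
        for (ChildSelection selection : path) {
            if (penalized.contains(selection)) {
                missingLevels++;
            }
        }
        return confidence * Math.pow(missingValuePenalty, missingLevels);
    }

    public static void main(String[] args) {
        List<ChildSelection> path = Arrays.asList(
            ChildSelection.SURROGATE, ChildSelection.PREDICATE, ChildSelection.DEFAULT_CHILD);
        // Reading 1: surrogate and defaultChild nodes are both penalized.
        System.out.println(penalize(0.9, 0.8, path,
            EnumSet.of(ChildSelection.SURROGATE, ChildSelection.DEFAULT_CHILD)));
        // Reading 2: only defaultChild nodes are penalized.
        System.out.println(penalize(0.9, 0.8, path, EnumSet.of(ChildSelection.DEFAULT_CHILD)));
    }
}
```

Either way the count is per node, not per missing value, which matches the note in the quoted paragraph.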
Hello,
I trained two models in KNIME: a Neural Network and a Decision Tree.
I'm comparing the results in KNIME and in Java.
For the Neural Network, I'm getting the same results.
For the Decision Tree model, all observations are predicted as false.
I tried to read the PMML model back inside KNIME, and the results do not come out right there.
Can you help me?
My scenario is this: I train a random forest and export it to a PMML file.
I use multipleModelMethod=weightedAverage and want to output the probability of each class label.
The PMML file looks like this:

Then it throws an exception:
org.jpmml.evaluator.TypeCheckException: Expected org.jpmml.evaluator.HasProbability, but got org.jpmml.evaluator.ClassificationMap ({0=0.6526508348685987, 1=0.3473491651314011})
at org.jpmml.evaluator.OutputUtil.asResultFeature(OutputUtil.java:848)
at org.jpmml.evaluator.OutputUtil.getProbability(OutputUtil.java:478)
at org.jpmml.evaluator.OutputUtil.evaluate(OutputUtil.java:182)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:117)
at org.jpmml.evaluator.MiningModelEvaluator.evaluate(MiningModelEvaluator.java:85)
at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:79)
at com.alipay.mymdp.model.component.impl.pmml.engine.PmmlComponentEngine.execute(PmmlComponentEngine.java:49)
at com.alipay.mymdp.model.component.impl.pmml.engine.PmmlComponentEngine.executePmmlComponentEngine(PmmlComponentEngine.java:35)
at com.alipay.mymdp.model.component.impl.pmml.engine.TestPmmlComponentEngine.testRF2PmmlCom
People are working with models that specify tens to hundreds of THOUSANDS of input fields:
Evaluator evaluator = ...;
List<InputField> inputFields = evaluator.getInputFields();
System.out.println(inputFields.size()); // Prints 100'000
For example: http://stats.stackexchange.com/questions/152891/bad-performance-of-pmml-evaluator and http://stackoverflow.com/questions/42074491/evaluate-method-takes-long-time-pmml-models-using-jpmml
Understandably, such "structurally valid but conceptually/functionally invalid" models cannot be made to perform, neither by the JPMML-Evaluator library nor by any other PMML scoring engine.
By default, the JPMML-Evaluator library should simply refuse to deal with them:
if(inputFields.size() > 1000){
    throw new EvaluationException("The model specifies an unreasonably large number of input fields, which is indicative of a bad data science/engineering process. Please refactor the model");
}
However, the limit should be programmatically customizable. If people want to do stupid things, then they should have the technical means to do so.
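A customizable limit of this kind could be sketched as follows. This is only an illustration of the proposal, not code from the JPMML-Evaluator library: the class name InputFieldLimit, its constructor and its check method are hypothetical, and the sketch throws a plain IllegalStateException instead of EvaluationException so that it stays self-contained.

```java
import java.util.List;

// Hypothetical helper, not part of the JPMML-Evaluator API.
final class InputFieldLimit {

    // Default taken from the proposal above: refuse models with more than 1000 input fields.
    static final int DEFAULT_MAX_INPUT_FIELDS = 1000;

    private final int maxInputFields;

    // Uses the default limit.
    InputFieldLimit() {
        this(DEFAULT_MAX_INPUT_FIELDS);
    }

    // Lets the caller raise (or lower) the limit programmatically.
    InputFieldLimit(int maxInputFields) {
        if (maxInputFields < 0) {
            throw new IllegalArgumentException("The limit must not be negative");
        }
        this.maxInputFields = maxInputFields;
    }

    // Throws if the list holds more elements than the configured limit; otherwise does nothing.
    void check(List<?> inputFields) {
        if (inputFields.size() > maxInputFields) {
            throw new IllegalStateException("The model specifies " + inputFields.size()
                + " input fields, which exceeds the configured limit of " + maxInputFields);
        }
    }
}
```

Under these assumptions, an application that really needs a very wide model would opt in explicitly, for example with new InputFieldLimit(200000).check(evaluator.getInputFields()), while everyone else gets the conservative default.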
Description - Duplicate methods named spliterator with the parameters () and () are inherited from the types Collection and Iterable
SparseArrayUtil.java /pmml-evaluator/src/main/java/org/jpmml/evaluator
While compiling the project in Eclipse, I am getting the above error. I downloaded the latest code yesterday.
Hi, I'm using this library as a dependency in a Maven project.
<dependency>
    <groupId>org.jpmml</groupId>
    <artifactId>pmml-evaluator</artifactId>
    <version>1.3.3</version>
</dependency>
When I compile the project, I get these warnings:
[WARNING] The POM for com.google.guava:guava:jar:19.0-SNAPSHOT is missing, no dependency information available
[WARNING] The POM for org.apache.commons:commons-math3:jar:3.5-SNAPSHOT is missing, no dependency information available
It looks like this is due to the version 'constraint' on the dependencies, for example <version>[14.0, 19.0]</version> for guava.
The documentation on version ranges says that snapshots can be taken into account when resolving them: https://docs.oracle.com/middleware/1221/core/MAVEN/maven_version.htm#MAVEN8903
Why do you use version constraints instead of just picking one version? And is there a way to get rid of this SNAPSHOT resolution?