jpmml / jpmml-python
Java library for converting Python models to PMML
License: GNU Affero General Public License v3.0
I'm sorry for bumping this old and closed issue, but strangely I am getting the same error now, which has never happened to me before. To make sure, I ran the LightGBM example from your blog entry. It resulted in the same error.
Standard output is empty
Standard error:
Exception in thread "main" net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 0
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:366)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at com.sklearn2pmml.Main.run(Main.java:78)
at com.sklearn2pmml.Main.main(Main.java:66)
Here is my environment:
python-3.10.5.amd64
sklearn2pmml 0.86.3
scikit-learn 1.1.2
pandas 1.5.0
sklearn-pandas 2.2.0
joblib 1.2.0
numpy 1.23.3
java version "1.8.0_141"
Java(TM) SE Runtime Environment (build 1.8.0_141-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.141-b15, mixed mode)
Is there anything that I can do to investigate further? Thank you
EDIT:
Reverting joblib to 1.1.0 solves the problem.
Originally posted by @denmase in jpmml/sklearn2pmml#8 (comment)
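One way to investigate an "invalid pickle opcode" error is to look at the first bytes of the file that was handed to the converter. The sketch below uses only the Python standard library. The helper name sniff_header and its three categories are illustrative assumptions, and it does not reproduce joblib's actual container logic.

```python
import pickle
import zlib


def sniff_header(head: bytes) -> str:
    """Classify the first two bytes of a serialized file (illustrative helper, not a joblib API)."""
    if len(head) >= 2 and head[0] == 0x80:
        # Pickle protocol 2 and newer start with the PROTO opcode (0x80) and the protocol number.
        return "pickle protocol %d" % head[1]
    if len(head) >= 2 and head[0] == 0x78 and (head[0] * 256 + head[1]) % 31 == 0:
        # A zlib stream starts with a two-byte header whose big-endian value is a multiple of 31.
        return "zlib stream"
    return "unknown"


plain = pickle.dumps({"a": 1}, protocol=2)
packed = zlib.compress(plain)
print(sniff_header(plain[:2]))   # pickle protocol 2
print(sniff_header(packed[:2]))  # zlib stream
```

For a file on disk, the same function can be fed with open(path, "rb").read(2). A header that is neither of the two may hint that the file layout changed between library versions, which would be consistent with the joblib downgrade making the error go away.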
I am using a nested if-else in an ExpressionTransformer. While generating the PMML, I get the error below:
'Python Expression is either invalid or not supported'
Kindly suggest how we can use a nested if-else in an ExpressionTransformer.
See jpmml/sklearn2pmml#379 (comment)
Should recognize both pcre and re module variants.
Hello,
I'm trying to use jpmml/jpmml-python, and I would like to ask how to implement, in Java, a function that is based on the pandas module.
The function uses pandas.DataFrame, like this:
import numpy
import pandas as pd

def psi(data_dict):
    # Build one column per distribution, indexed by the bin keys
    data = {'actucal': list(data_dict['actucal'].values()), 'expected': list(data_dict['expected'].values())}
    df = pd.DataFrame(data, index=list(data_dict['actucal'].keys()))
    # Per-bin contribution: (actucal - expected) * ln(actucal / expected)
    df['ind'] = (df['actucal'] - df['expected']) * numpy.log(df['actucal'] / df['expected'])
    psi = sum(df['ind'])
    return psi
Can I perhaps use the command line to convert it, like this:
java -jar target/jpmml-python-executable-*.jar #--{some_parameters} file_name.pkl --pmml-output file_name.pmml
Thank you in advance for taking the time to answer my question!
@vruusmann
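The arithmetic in psi does not depend on pandas itself. As a reference for porting it to another language, here is the same sum written with only the standard library. The parameter names actual and expected are illustrative, and the inputs are assumed to be dicts keyed by the same bins with strictly positive values.

```python
import math


def psi(actual: dict, expected: dict) -> float:
    """Sum of (a - e) * ln(a / e) over all bins (sketch; assumes positive values)."""
    total = 0.0
    for key, a in actual.items():
        e = expected[key]
        total += (a - e) * math.log(a / e)
    return total


print(psi({"low": 0.6, "high": 0.4}, {"low": 0.5, "high": 0.5}))
```

Because the loop only uses subtraction, division and a natural logarithm, it maps directly onto a plain Java loop over two maps with Math.log.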
See jpmml/sklearn2pmml#215 (February 2023 comments)
I'm attempting to build a simple LinearRegression pipeline that performs some preprocessing. The code is roughly,
X = data.filter(items=['Width'])
y = data['Weight']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
mapper = DataFrameMapper([
    (["Width"], [ContinuousDomain(), FunctionTransformer(np.log)])
])
model_pipeline = PMMLPipeline([
    ("mapper", mapper),
    ("model", LinearRegression())
])
clf = model_pipeline.fit(X_train, y_train)
sklearn2pmml(clf, 'model.pmml', with_repr=True, debug=True)
However I get the following error,
python: 3.8.8
sklearn: 0.24.1
sklearn2pmml: 0.69.0
joblib: 1.0.1
sklearn_pandas: 2.1.0
pandas: 1.2.3
numpy: 1.20.2
openjdk: 1.8.0_282
Executing command:
java -cp /opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/gson-2.8.6.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/guava-21.0.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/h2o-genmodel-3.32.0.4.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/h2o-logger-3.32.0.4.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/h2o-tree-api-0.3.17.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/istack-commons-runtime-3.0.11.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jakarta.activation-1.2.2.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jakarta.xml.bind-api-2.3.3.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jaxb-runtime-2.3.3.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jcommander-1.72.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jpmml-converter-1.4.6.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jpmml-h2o-1.1.4.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jpmml-lightgbm-1.3.6.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jpmml-python-1.0.11.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.6.15.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.5.0.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/pickle-1.1.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/pmml-model-1.5.11.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/pmml-model-metro-1.5.11.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/serpent-1.30.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/slf4j-api-1.7.30.jar:/opt/conda/lib/python3.8/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.30.jar org.jpmml.sklearn.Main --pkl-pipeline-input /tmp/pipeline-q4tmq2kn.pkl.z --pmml-output fish-weight-model.pmml
Standard output is empty
Standard error:
Apr 05, 2021 5:09:27 AM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Apr 05, 2021 5:09:27 AM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 19 ms.
Apr 05, 2021 5:09:27 AM org.jpmml.sklearn.Main run
INFO: Converting PKL to PMML..
Apr 05, 2021 5:09:27 AM org.jpmml.sklearn.Main run
SEVERE: Failed to convert PKL to PMML
java.lang.IllegalArgumentException: Attribute 'sklearn.preprocessing._function_transformer.FunctionTransformer.func' has an unsupported value (Python class numpy.core._multiarray_umath.log)
at org.jpmml.python.CastFunction.apply(CastFunction.java:45)
at org.jpmml.python.PythonObject.get(PythonObject.java:91)
at org.jpmml.python.PythonObject.getOptional(PythonObject.java:101)
at sklearn.preprocessing.FunctionTransformer.getFunc(FunctionTransformer.java:68)
at sklearn.preprocessing.FunctionTransformer.encodeFeatures(FunctionTransformer.java:44)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
at sklearn.Initializer.encodeFeatures(Initializer.java:48)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:212)
at org.jpmml.sklearn.Main.run(Main.java:233)
at org.jpmml.sklearn.Main.main(Main.java:151)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDictConstructor to org.jpmml.python.Identifiable
at java.lang.Class.cast(Class.java:3369)
at org.jpmml.python.CastFunction.apply(CastFunction.java:43)
... 12 more
Exception in thread "main" java.lang.IllegalArgumentException: Attribute 'sklearn.preprocessing._function_transformer.FunctionTransformer.func' has an unsupported value (Python class numpy.core._multiarray_umath.log)
at org.jpmml.python.CastFunction.apply(CastFunction.java:45)
at org.jpmml.python.PythonObject.get(PythonObject.java:91)
at org.jpmml.python.PythonObject.getOptional(PythonObject.java:101)
at sklearn.preprocessing.FunctionTransformer.getFunc(FunctionTransformer.java:68)
at sklearn.preprocessing.FunctionTransformer.encodeFeatures(FunctionTransformer.java:44)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
at sklearn.Initializer.encodeFeatures(Initializer.java:48)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:212)
at org.jpmml.sklearn.Main.run(Main.java:233)
at org.jpmml.sklearn.Main.main(Main.java:151)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDictConstructor to org.jpmml.python.Identifiable
at java.lang.Class.cast(Class.java:3369)
at org.jpmml.python.CastFunction.apply(CastFunction.java:43)
... 12 more
Any help would be appreciated.
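The error message names numpy.core._multiarray_umath.log because a function is pickled by reference (a module path plus a name) rather than by value, so the Java side only sees that name and has to recognize it. The sketch below shows the mechanism with the standard library, using math.log as a stand-in for np.log. That numpy ufuncs are stored in a comparable way is an assumption here, and the helper name pickled_global_refs is illustrative.

```python
import math
import pickle
import pickletools


def pickled_global_refs(obj, protocol=2):
    """List the module.name references stored in the pickle of obj (illustrative helper)."""
    refs = []
    for opcode, arg, _pos in pickletools.genops(pickle.dumps(obj, protocol=protocol)):
        # Protocols 0 to 3 store a by-reference object with the GLOBAL opcode,
        # whose argument pickletools reports as "module name".
        if opcode.name == "GLOBAL":
            refs.append(arg.replace(" ", "."))
    return refs


print(pickled_global_refs(math.log))  # ['math.log']
```

Running the same helper on the FunctionTransformer's func attribute would show which module path ends up in the pickle for a given numpy version.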
I am unable to create a PMML extract of my sklearn pipeline.
This is how I create my pipeline:
from category_encoders import OrdinalEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier
import sklearn2pmml

cat_vars = ['X1', 'X2', .. ]
categorical_transformer = Pipeline(steps=[
    ('WOEEnc', OrdinalEncoder(handle_missing='missing'))])
preprocessor = ColumnTransformer(remainder='passthrough',
                                 transformers=[
                                     ('cat', categorical_transformer, cat_vars)])
clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', XGBClassifier(**params))])
clf.fit(X, y)
sklearn2pmml.sklearn2pmml(sklearn2pmml.make_pmml_pipeline(clf), 'Exp_model.pmml', debug=True)
Stacktrace:
python: 3.7.9
sklearn: 0.24.1
sklearn2pmml: 0.69.0
joblib: 0.17.0
sklearn_pandas: 2.1.0
pandas: 1.2.1
numpy: 1.20.0
openjdk: 13.0.1
Executing command:
java -cp /Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/gson-2.8.6.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/guava-21.0.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/h2o-genmodel-3.32.0.4.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/h2o-logger-3.32.0.4.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/h2o-tree-api-0.3.17.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/istack-commons-runtime-3.0.11.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jakarta.activation-1.2.2.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jakarta.xml.bind-api-2.3.3.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jaxb-runtime-2.3.3.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jcommander-1.72.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jpmml-converter-1.4.6.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jpmml-h2o-1.1.4.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jpmml-lightgbm-1.3.6.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jpmml-python-1.0.11.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.6.15.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.5.0.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/pickle-1.1.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/pmml-model-1.5.11.jar:/Users/myUser//anaconda3/envs/dev/lib/p
ython3.7/site-packages/sklearn2pmml/resources/pmml-model-metro-1.5.11.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/serpent-1.30.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/slf4j-api-1.7.30.jar:/Users/myUser//anaconda3/envs/dev/lib/python3.7/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.30.jar org.jpmml.sklearn.Main --pkl-pipeline-input /var/folders/kr/730vknb91nz9yr07hdyvwvlm0000gn/T/pipeline-1f8er7vh.pkl.z --pmml-output Documents/IBM/Projects/EID/Exp_model_joblib.pmml
Standard output is empty
Standard error:
Mar 26, 2021 9:48:37 AM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Mar 26, 2021 9:48:37 AM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 120 ms.
Mar 26, 2021 9:48:37 AM org.jpmml.sklearn.Main run
INFO: Converting PKL to PMML..
Mar 26, 2021 9:48:37 AM sklearn2pmml.pipeline.PMMLPipeline initTargetFields
WARNING: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.target_fields' is not set. Assuming y as the name of the target field
Mar 26, 2021 9:48:37 AM org.jpmml.sklearn.Main run
SEVERE: Failed to convert PKL to PMML
java.lang.IllegalArgumentException: Attribute 'pandas.core.series.Series._data' not set
at org.jpmml.python.PythonObject.get(PythonObject.java:69)
at pandas.core.Series.getData(Series.java:30)
at category_encoders.OrdinalEncoder.getCategoryMapping(OrdinalEncoder.java:115)
at category_encoders.OrdinalEncoder.encodeFeatures(OrdinalEncoder.java:62)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn.pipeline.PipelineTransformer.encodeFeatures(PipelineTransformer.java:65)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.compose.ColumnTransformer.encodeFeatures(ColumnTransformer.java:63)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:212)
at org.jpmml.sklearn.Main.run(Main.java:233)
at org.jpmml.sklearn.Main.main(Main.java:151)
Exception in thread "main" java.lang.IllegalArgumentException: Attribute 'pandas.core.series.Series._data' not set
at org.jpmml.python.PythonObject.get(PythonObject.java:69)
at pandas.core.Series.getData(Series.java:30)
at category_encoders.OrdinalEncoder.getCategoryMapping(OrdinalEncoder.java:115)
at category_encoders.OrdinalEncoder.encodeFeatures(OrdinalEncoder.java:62)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn.pipeline.PipelineTransformer.encodeFeatures(PipelineTransformer.java:65)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.compose.ColumnTransformer.encodeFeatures(ColumnTransformer.java:63)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:212)
at org.jpmml.sklearn.Main.run(Main.java:233)
at org.jpmml.sklearn.Main.main(Main.java:151)
I'm writing a feature-generation preprocessing step that uses ExpressionTransformer to check whether a value is close to a round multiple of 10:
feature_5 = DataFrameMapper([
    (["Amount_Usd"],
     [Alias(ExpressionTransformer("1 if ((X[0] % 10.) <= 0.1) | ((X[0] % 10.) >= 9.9) else 0"),
            name="feature_5", prefit=True)],)
])
feature_5
DataFrameMapper(drop_cols=[],
features=[(['Amount_Usd'],
[Alias(name='feature_5', prefit=True,
transformer=ExpressionTransformer(expr='1 if '
'((X[0] '
'% '
'10.) '
'<= '
'0.1) '
'| '
'((X[0] '
'% '
'10.) '
'>= '
'9.9) '
'else '
'0'))])])
The expression is valid Python, but there seems to be an issue with the translation of | to LogicalOr?
> SEVERE: Failed to convert PKL to PMML
org.jpmml.python.TokenMgrException: Lexical error at line 1, column 28. Encountered: "|" (124), after : ""
at org.jpmml.python.ExpressionTranslatorTokenManager.getNextToken(ExpressionTranslatorTokenManager.java:619)
at org.jpmml.python.ExpressionTranslator.jj_scan_token(ExpressionTranslator.java:1967)
at org.jpmml.python.ExpressionTranslator.jj_3R_TrailerFunctionInvocationExpression_745_9_59(ExpressionTranslator.java:1182)
at org.jpmml.python.ExpressionTranslator.jj_3R_PrimaryExpression_634_58_44(ExpressionTranslator.java:1386)
at org.jpmml.python.ExpressionTranslator.jj_3R_PrimaryExpression_634_58_35(ExpressionTranslator.java:1377)
at org.jpmml.python.ExpressionTranslator.jj_3R_PrimaryExpression_634_17_28(ExpressionTranslator.java:1581)
at org.jpmml.python.ExpressionTranslator.jj_3R_PrimaryExpression_623_9_26(ExpressionTranslator.java:1633)
at org.jpmml.python.ExpressionTranslator.jj_3R_UnaryExpression_605_17_25(ExpressionTranslator.java:1651)
at org.jpmml.python.ExpressionTranslator.jj_3R_UnaryExpression_600_9_20(ExpressionTranslator.java:1697)
at org.jpmml.python.ExpressionTranslator.jj_3R_MultiplicativeExpression_587_9_15(ExpressionTranslator.java:1732)
at org.jpmml.python.ExpressionTranslator.jj_3R_AdditiveExpression_563_9_12(ExpressionTranslator.java:1124)
at org.jpmml.python.ExpressionTranslator.jj_3_1(ExpressionTranslator.java:1174)
at org.jpmml.python.ExpressionTranslator.jj_2_1(ExpressionTranslator.java:1069)
at org.jpmml.python.ExpressionTranslator.ComparisonExpression(ExpressionTranslator.java:398)
at org.jpmml.python.ExpressionTranslator.NegationExpression(ExpressionTranslator.java:387)
at org.jpmml.python.ExpressionTranslator.LogicalAndExpression(ExpressionTranslator.java:357)
at org.jpmml.python.ExpressionTranslator.LogicalOrExpression(ExpressionTranslator.java:336)
at org.jpmml.python.ExpressionTranslator.IfElseExpression(ExpressionTranslator.java:317)
at org.jpmml.python.ExpressionTranslator.Expression(ExpressionTranslator.java:310)
at org.jpmml.python.ExpressionTranslator.IfElseExpression(ExpressionTranslator.java:321)
at org.jpmml.python.ExpressionTranslator.Expression(ExpressionTranslator.java:310)
at org.jpmml.python.ExpressionTranslator.translateExpressionInternal(ExpressionTranslator.java:304)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:34)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:23)
at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:52)
at sklearn2pmml.decoration.Alias.encodeFeatures(Alias.java:56)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
at sklearn.Initializer.encodeFeatures(Initializer.java:48)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.pipeline.FeatureUnion.encodeFeatures(FeatureUnion.java:45)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:211)
at org.jpmml.sklearn.Main.run(Main.java:226)
at org.jpmml.sklearn.Main.main(Main.java:143)
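The lexer rejects the | character outright, and the parse errors quoted elsewhere in this thread list "and" and "or" among the expected tokens, so spelling the condition with or looks worth trying. The sketch below only checks, in plain Python, that the two spellings agree for this expression. It does not call the converter, and the helper name evaluate is an assumption.

```python
EXPR_BITWISE = "1 if ((X[0] % 10.) <= 0.1) | ((X[0] % 10.) >= 9.9) else 0"
EXPR_LOGICAL = "1 if ((X[0] % 10.) <= 0.1) or ((X[0] % 10.) >= 9.9) else 0"


def evaluate(expr, value):
    """Evaluate an expression string against a one-element row X (illustrative helper)."""
    return eval(expr, {"__builtins__": {}}, {"X": [value]})


samples = [0.0, 9.95, 10.05, 15.0, 20.0, 123.4]
for value in samples:
    # Both operands are booleans, so | and or yield the same truth value.
    assert evaluate(EXPR_BITWISE, value) == evaluate(EXPR_LOGICAL, value)

print([evaluate(EXPR_LOGICAL, value) for value in samples])
```

The equivalence holds here because both sides of the operator are comparison results. For integer operands, | is a bitwise operation and the two spellings would differ.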
Hello Villu,
I am having problems with the sklearn2pmml conversion:
Standard output is empty
Standard error:
Exception in thread "main" java.lang.IllegalArgumentException: Function 'builtins.int' is not supported
at org.jpmml.python.FunctionUtil.encodePythonFunction(FunctionUtil.java:103)
at org.jpmml.python.FunctionUtil.encodeFunction(FunctionUtil.java:72)
at org.jpmml.python.ExpressionTranslator.translateFunction(ExpressionTranslator.java:186)
at org.jpmml.python.ExpressionTranslator.FunctionInvocationExpression(ExpressionTranslator.java:849)
at org.jpmml.python.ExpressionTranslator.PrimaryExpression(ExpressionTranslator.java:646)
at org.jpmml.python.ExpressionTranslator.UnaryExpression(ExpressionTranslator.java:594)
at org.jpmml.python.ExpressionTranslator.MultiplicativeExpression(ExpressionTranslator.java:539)
at org.jpmml.python.ExpressionTranslator.AdditiveExpression(ExpressionTranslator.java:495)
at org.jpmml.python.ExpressionTranslator.ComparisonExpression(ExpressionTranslator.java:435)
at org.jpmml.python.ExpressionTranslator.NegationExpression(ExpressionTranslator.java:390)
at org.jpmml.python.ExpressionTranslator.LogicalAndExpression(ExpressionTranslator.java:373)
at org.jpmml.python.ExpressionTranslator.LogicalOrExpression(ExpressionTranslator.java:339)
at org.jpmml.python.ExpressionTranslator.IfElseExpression(ExpressionTranslator.java:320)
at org.jpmml.python.ExpressionTranslator.Expression(ExpressionTranslator.java:313)
at org.jpmml.python.ExpressionTranslator.IfElseExpression(ExpressionTranslator.java:324)
at org.jpmml.python.ExpressionTranslator.Expression(ExpressionTranslator.java:313)
at org.jpmml.python.ExpressionTranslator.translateExpressionInternal(ExpressionTranslator.java:307)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:33)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:22)
at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:73)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn.pipeline.PipelineTransformer.encodeFeatures(PipelineTransformer.java:65)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.pipeline.FeatureUnion.encodeFeatures(FeatureUnion.java:45)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn_pandas.DataFrameMapper.encodeFeatures(DataFrameMapper.java:67)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:153)
at com.sklearn2pmml.Main.run(Main.java:91)
at com.sklearn2pmml.Main.main(Main.java:66)
It seems that the int in the following code is not being handled correctly:
def make_modify_date_pipeline():
    return make_pipeline(ExpressionTransformer("X[0][:4] + '-' + X[0][4:6] + '-' + X[0][6:8] if len(X[0]) > 0 and int(X[0][0:8]) < 20221230 else '2022-12-30'"), CastTransformer(dtype = "datetime64[D]"), DaysSinceYearTransformer(year = 2022))
Of course, we've talked about this before, and you gave me this tip about CastTransformer:
you should be using the good old CastTransformer instead.
I have upgraded to the latest sklearn2pmml version. Do you mean that I should change the sklearn version? (That is not possible, because I am working on the company's notebook and I am not allowed to change the sklearn version!)
So, is there any other way to complete this operation? I wrote it this way because I cannot compare a str with an int, so int is needed. In pure Python I would have many ways to solve it, but in a pipeline I don't know how to handle it!
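If the input is always an 8-digit YYYYMMDD string, the int() call may be avoidable: fixed-width digit strings sort the same way lexicographically as they do numerically, so the bound can be written as a string literal. Whether the expression translator accepts string comparison is not confirmed here. The sketch below only demonstrates the equivalence in plain Python, and both function names are illustrative.

```python
def modify_date_int(s):
    """Original logic: numeric comparison via int()."""
    return s[:4] + '-' + s[4:6] + '-' + s[6:8] if len(s) > 0 and int(s[0:8]) < 20221230 else '2022-12-30'


def modify_date_str(s):
    """Same logic with a string comparison (valid only for 8-digit prefixes)."""
    return s[:4] + '-' + s[4:6] + '-' + s[6:8] if len(s) > 0 and s[0:8] < '20221230' else '2022-12-30'


for raw in ["20220115", "20221229", "20221230", "20230101", ""]:
    assert modify_date_int(raw) == modify_date_str(raw)
    print(repr(raw), "->", modify_date_str(raw))
```

The equivalence breaks for prefixes shorter than eight digits (for example "3" compares greater than "20221230" as a string), so the input format needs to be guaranteed upstream.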
Hi,
I want to have a feature based on the length of my data.
In pandas it is something like this:
data['feature_length'] = data['feature'].apply(lambda a: len(a)) == 19
In sklearn2pmml / sklearn-pandas I used the code below:
recorder.features = recorder.features + [(
    [feature],
    [
        # CastTransformer(str),
        CategoricalDomain(dtype=str),
        SimpleImputer(missing_values=np.nan, strategy='constant', fill_value='Miss'),
        # SubstringTransformer(15, 20),
        Alias(ExpressionTransformer("0 if len(X[0]) ==19 else 1"),name='feature_Len'),
        Alias(CastTransformer(int), name="feature_Len")
    ], {'alias': "ki_hfcustomerext_mobileappretail_ach_vset_ne_"}
)]
and got an error:
SEVERE: Failed to convert PKL to PMML
java.lang.IllegalArgumentException: len
at org.jpmml.python.ExpressionTranslator.translateFunction(ExpressionTranslator.java:158)
at org.jpmml.python.ExpressionTranslator.FunctionInvocationExpression(ExpressionTranslator.java:634)
at org.jpmml.python.ExpressionTranslator.PrimaryExpression(ExpressionTranslator.java:533)
What command can I use to fix this error?
Hello Villu,
How can I use scipy tools in ExpressionTransformer? Or where can I read about which libraries ExpressionTransformer supports?
('feature', ExpressionTransformer('scipy.special.logit(X[0])'))
PMML said "No module name scipy". Version 0.74.1
(I know that I can use numpy.log(X[0]/(1-X[0])), but maybe scipy will work for me?)
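For reference, the logit is just log(p / (1 - p)), which is why the numpy.log spelling mentioned above computes the same quantity. A minimal standard-library sketch, with expit added only as a round-trip check (both function names are illustrative, not scipy imports):

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    """Inverse of logit (the logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))


print(logit(0.5))          # 0.0
print(expit(logit(0.2)))   # approximately 0.2
```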
How do I convert a *.pkl file to PMML?
Inspiration: jpmml/jpmml-evaluator#193 (comment)
Hi, I have a scenario where I need to use an array as an input column to my pipeline.
I've reduced the issue I'm having to a minimal example:
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.compose import ColumnTransformer
from sklearn2pmml.preprocessing import ExpressionTransformer
df = pd.DataFrame({'c1': [1, 2, 3], 'c2': [[1,2], [1,2], [3,1]]})
pipeline = make_pipeline(
    ColumnTransformer(
        transformers=[
            (f'get_item_0_from_c2_array', ExpressionTransformer('X["c2"][0]'), ['c2'])
        ]
    ),
    LogisticRegression(),
)
pipeline.fit(df, [0, 0, 1])
pipeline.predict(df)
The above pipeline works fine in my Jupyter notebook. But converting it to PMML gives an error:
import sklearn2pmml
pmml_pipeline = sklearn2pmml.PMMLPipeline(steps=[
    ('pipeline', pipeline)
])
sklearn2pmml.sklearn2pmml(pmml_pipeline, './pipeline.pmml', debug=True)
Gives the error:
java.lang.IllegalArgumentException: Python expression 'X["c2"][0]' is either invalid or not supported
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:36)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:23)
at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:51)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.compose.ColumnTransformer.encodeFeatures(ColumnTransformer.java:63)
at sklearn.Transformer.encode(Transformer.java:70)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn.Composite.encodeModel(Composite.java:135)
at sklearn.pipeline.PipelineClassifier.encodeModel(PipelineClassifier.java:86)
at sklearn.Estimator.encode(Estimator.java:103)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:233)
at org.jpmml.sklearn.Main.run(Main.java:217)
at org.jpmml.sklearn.Main.main(Main.java:143)
Caused by: org.jpmml.python.ParseException: Encountered unexpected token: "]" "]"
at line 1, column 10.
Was expecting one of:
":"
at org.jpmml.python.ExpressionTranslator.generateParseException(ExpressionTranslator.java:2110)
at org.jpmml.python.ExpressionTranslator.jj_consume_token(ExpressionTranslator.java:1973)
at org.jpmml.python.ExpressionTranslator.StringSlicingExpression(ExpressionTranslator.java:956)
at org.jpmml.python.ExpressionTranslator.PrimaryExpression(ExpressionTranslator.java:637)
at org.jpmml.python.ExpressionTranslator.UnaryExpression(ExpressionTranslator.java:597)
at org.jpmml.python.ExpressionTranslator.MultiplicativeExpression(ExpressionTranslator.java:538)
at org.jpmml.python.ExpressionTranslator.AdditiveExpression(ExpressionTranslator.java:494)
at org.jpmml.python.ExpressionTranslator.ComparisonExpression(ExpressionTranslator.java:434)
at org.jpmml.python.ExpressionTranslator.NegationExpression(ExpressionTranslator.java:389)
at org.jpmml.python.ExpressionTranslator.LogicalAndExpression(ExpressionTranslator.java:359)
at org.jpmml.python.ExpressionTranslator.LogicalOrExpression(ExpressionTranslator.java:338)
at org.jpmml.python.ExpressionTranslator.IfElseExpression(ExpressionTranslator.java:319)
at org.jpmml.python.ExpressionTranslator.Expression(ExpressionTranslator.java:312)
at org.jpmml.python.ExpressionTranslator.translateExpressionInternal(ExpressionTranslator.java:306)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:34)
... 12 more
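The parser trace shows X["c2"][0] being read as a string slice that lacks its ":", so indexing into an array-valued cell is apparently not something this grammar understands. One possible workaround, offered as an assumption rather than a confirmed fix, is to unpack the list into scalar columns before the data reaches the pipeline, so that each expression only sees scalar values. The sketch below shows the idea on plain dict rows; the helper name split_list_column and the c2_0/c2_1 naming are hypothetical.

```python
def split_list_column(rows, column, width):
    """Replace a list-valued column with `width` scalar columns (illustrative helper)."""
    out = []
    for row in rows:
        # Copy every other column unchanged.
        flat = {key: value for key, value in row.items() if key != column}
        # Spread the list elements over column_0, column_1, ...
        for i in range(width):
            flat["%s_%d" % (column, i)] = row[column][i]
        out.append(flat)
    return out


rows = [{"c1": 1, "c2": [1, 2]}, {"c1": 2, "c2": [1, 2]}, {"c1": 3, "c2": [3, 1]}]
flat_rows = split_list_column(rows, "c2", 2)
print(flat_rows[0])  # {'c1': 1, 'c2_0': 1, 'c2_1': 2}
```

With the data in this shape, the first array element is an ordinary column that a pipeline step can select by name, and no subscript on an array is needed inside the expression.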
Hello Villu,
Not sure if this is an issue per se, but it seems like the following statement is not allowed for PMML conversion:
ExpressionTransformer("X[1][0:X[0]]") - Where X[1] is a string and X[0] is an integer I want to use to slice X[1].
Is there any way to achieve this with the current functionality? I have tested different options and they all seem to not work for one reason or another.
The expression works fine when testing in Python, however, when using sklearn2pmml.sklearn2pmml it throws an error. Please find an example of the code:
prep_pipe = pipeline.Pipeline([
    # Previous transformations
    , ("trim_string", proc.ExpressionTransformer("X[1][0:X[0]]"))
])
fitted_pipe = prep_pipe.fit(df[column_list])
fitted_pipe_pmml = sklearn2pmml.make_pmml_pipeline(fitted_pipe)
sklearn2pmml.sklearn2pmml(fitted_pipe_pmml, path)
This gives the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Python expression 'X[1][0:X[0]]' is either invalid or not supported
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:35)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:22)
at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:73)
at sklearn.Transformer.encode(Transformer.java:69)
at sklearn.Composite.encodeFeatures(Composite.java:119)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:212)
at com.sklearn2pmml.Main.run(Main.java:84)
at com.sklearn2pmml.Main.main(Main.java:62)
Caused by: org.jpmml.python.ParseException: Encountered unexpected token: "X" <NAME>
at line 1, column 8.
Was expecting one of:
"+"
"-"
"]"
<INT>
at org.jpmml.python.ExpressionTranslator.generateParseException(ExpressionTranslator.java:2152)
at org.jpmml.python.ExpressionTranslator.jj_consume_token(ExpressionTranslator.java:2015)
at org.jpmml.python.ExpressionTranslator.StringSlicingExpression(ExpressionTranslator.java:958)
at org.jpmml.python.ExpressionTranslator.PrimaryExpression(ExpressionTranslator.java:628)
at org.jpmml.python.ExpressionTranslator.UnaryExpression(ExpressionTranslator.java:588)
at org.jpmml.python.ExpressionTranslator.MultiplicativeExpression(ExpressionTranslator.java:529)
at org.jpmml.python.ExpressionTranslator.AdditiveExpression(ExpressionTranslator.java:485)
at org.jpmml.python.ExpressionTranslator.ComparisonExpression(ExpressionTranslator.java:425)
at org.jpmml.python.ExpressionTranslator.NegationExpression(ExpressionTranslator.java:380)
at org.jpmml.python.ExpressionTranslator.LogicalAndExpression(ExpressionTranslator.java:350)
at org.jpmml.python.ExpressionTranslator.LogicalOrExpression(ExpressionTranslator.java:329)
at org.jpmml.python.ExpressionTranslator.IfElseExpression(ExpressionTranslator.java:310)
at org.jpmml.python.ExpressionTranslator.Expression(ExpressionTranslator.java:303)
at org.jpmml.python.ExpressionTranslator.translateExpressionInternal(ExpressionTranslator.java:297)
at org.jpmml.python.ExpressionTranslator.translate(ExpressionTranslator.java:33)
... 7 more
Thank you!
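The parse error lists "+", "-", "]" and <INT> as the only tokens accepted at that position, which suggests that slice bounds have to be integer literals in this grammar version. A small pre-flight check along those lines can be written with Python's ast module. This is a sketch under that assumption, not part of sklearn2pmml, and the function names are illustrative.

```python
import ast


def _is_int_literal(node):
    """True for a missing bound or an optionally signed integer literal."""
    if node is None:
        return True
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
        node = node.operand
    return isinstance(node, ast.Constant) and type(node.value) is int


def dynamic_slice_bounds(expr):
    """Return the source text of every slice bound that is not an integer literal."""
    tree = ast.parse(expr, mode="eval")
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Subscript) and isinstance(node.slice, ast.Slice):
            for bound in (node.slice.lower, node.slice.upper):
                if not _is_int_literal(bound):
                    found.append(ast.get_source_segment(expr, bound))
    return found


print(dynamic_slice_bounds("X[1][0:X[0]]"))  # ['X[0]']
print(dynamic_slice_bounds("X[0][:4]"))      # []
```

An empty result does not guarantee that the converter accepts the expression. It only rules out this one pattern before the Java step is launched.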
Hi Villu.
I notice in UFuncUtil.java that PMMLFunctions.POW already exists and is used for square.
Is it now possible to pass 2 params and implement this?
Hi,
I am creating a DataFrameMapper in which ExpressionTransformer is used for one of the columns. I want to fill in a value based on multiple conditional statements, and for that I want to write if/elif statements.
Currently, I am using the following syntax, but sklearn2pmml is throwing an error.
ExpressionTransformer('100*X[1]/X[0] if X[0]>0 and X[0]<=90 and X[1]>0 and X[1]<=90 else X[1]+900 if X[1]>90 else 0')
Below is the error:
/usr/lib/python3.8/subprocess.py:848: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
self.stdout = io.open(c2pread, 'rb', bufsize)
/usr/lib/python3.8/subprocess.py:853: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
self.stderr = io.open(errread, 'rb', bufsize)
Standard output is empty
Standard error:
Mar 10, 2022 12:44:11 PM org.jpmml.sklearn.Main run
INFO: Parsing PKL..
Mar 10, 2022 12:44:12 PM org.jpmml.sklearn.Main run
INFO: Parsed PKL in 100 ms.
Mar 10, 2022 12:44:12 PM org.jpmml.sklearn.Main run
INFO: Converting..
Mar 10, 2022 12:44:12 PM org.jpmml.sklearn.Main run
SEVERE: Failed to convert
java.lang.IllegalArgumentException: Python expression '100*X[1]/X[0] if X[0]>0 and X[0]<=90 and X[1]>0 and X[1]<=90 else X[1]+900 if X[1]>90 else 0' is either invalid or not supported
at org.jpmml.sklearn.ExpressionTranslator.translate(ExpressionTranslator.java:76)
at org.jpmml.sklearn.ExpressionTranslator.translate(ExpressionTranslator.java:63)
at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:47)
at sklearn2pmml.decoration.Alias.encodeFeatures(Alias.java:56)
at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
at sklearn.Initializer.encodeFeatures(Initializer.java:44)
at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
at sklearn.Composite.encodeFeatures(Composite.java:129)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:209)
at org.jpmml.sklearn.Main.run(Main.java:228)
at org.jpmml.sklearn.Main.main(Main.java:148)
Caused by: org.jpmml.sklearn.ParseException: Encountered unexpected token: "if" "if"
at line 1, column 76.
Was expecting one of:
"!="
"%"
"*"
"+"
"-"
"/"
"<"
"<="
"=="
">"
">="
"and"
"or"
<EOF>
at org.jpmml.sklearn.ExpressionTranslator.generateParseException(ExpressionTranslator.java:1558)
at org.jpmml.sklearn.ExpressionTranslator.jj_consume_token(ExpressionTranslator.java:1426)
at org.jpmml.sklearn.ExpressionTranslator.translateExpressionInternal(ExpressionTranslator.java:215)
at org.jpmml.sklearn.ExpressionTranslator.translate(ExpressionTranslator.java:74)
... 11 more
Exception in thread "main" java.lang.IllegalArgumentException: Python expression '100*X[1]/X[0] if X[0]>0 and X[0]<=90 and X[1]>0 and X[1]<=90 else X[1]+900 if X[1]>90 else 0' is either invalid or not supported
at org.jpmml.sklearn.ExpressionTranslator.translate(ExpressionTranslator.java:76)
at org.jpmml.sklearn.ExpressionTranslator.translate(ExpressionTranslator.java:63)
at sklearn2pmml.preprocessing.ExpressionTransformer.encodeFeatures(ExpressionTransformer.java:47)
at sklearn2pmml.decoration.Alias.encodeFeatures(Alias.java:56)
at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
at sklearn_pandas.DataFrameMapper.initializeFeatures(DataFrameMapper.java:73)
at sklearn.Initializer.encodeFeatures(Initializer.java:44)
at sklearn.Transformer.updateAndEncodeFeatures(Transformer.java:118)
at sklearn.Composite.encodeFeatures(Composite.java:129)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:209)
at org.jpmml.sklearn.Main.run(Main.java:228)
at org.jpmml.sklearn.Main.main(Main.java:148)
Caused by: org.jpmml.sklearn.ParseException: Encountered unexpected token: "if" "if"
at line 1, column 76.
Was expecting one of:
"!="
"%"
"*"
"+"
"-"
"/"
"<"
"<="
"=="
">"
">="
"and"
"or"
<EOF>
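The parser stops at the second "if" (column 76), that is, at the conditional chained after the first else. In Python, a chained conditional groups to the right, so wrapping the inner one in parentheses does not change its meaning. Whether this translator version accepts the parenthesized form is not confirmed here. The sketch below only verifies, in plain Python, that the two spellings are equivalent, and the helper name evaluate is an assumption.

```python
import ast

CHAINED = ("100*X[1]/X[0] if X[0]>0 and X[0]<=90 and X[1]>0 and X[1]<=90 "
           "else X[1]+900 if X[1]>90 else 0")
GROUPED = ("100*X[1]/X[0] if X[0]>0 and X[0]<=90 and X[1]>0 and X[1]<=90 "
           "else (X[1]+900 if X[1]>90 else 0)")


def evaluate(expr, row):
    """Evaluate an expression string against a row X (illustrative helper)."""
    return eval(expr, {"__builtins__": {}}, {"X": row})


# The else-branch of the outer conditional is itself a conditional expression.
outer = ast.parse(CHAINED, mode="eval").body
assert isinstance(outer, ast.IfExp) and isinstance(outer.orelse, ast.IfExp)

for row in [[50, 25], [0, 100], [0, 10], [95, 95]]:
    assert evaluate(CHAINED, row) == evaluate(GROUPED, row)
    print(row, "->", evaluate(GROUPED, row))
```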