Comments (5)
If I have a feature matrix in which the feature names contain special characters (such as &), the package throws an exception related to RExpParser.
Can you paste the full stack trace of this exception here?
Better yet, can you provide a reproducible example (a toy dataset and an R script) that I could play with?
from jpmml-r.
Here is the output of the command.
I will send you a precise example as soon as possible.
D:\jpmml-r-master>java -Xms4G -Xmx16G -jar target/converter-executable-1.2-SNAPSHOT.jar --rds-input LibSVMAnomalyFormulaReq.rds --pmml-output model.pmml
set 19, 2017 4:59:39 PM org.jpmml.rexp.Main run
INFORMAZIONI: Parsing RDS..
Exception in thread "main" java.lang.StackOverflowError
at java.io.DataInputStream.readInt(Unknown Source)
at org.jpmml.rexp.XDRInput.readInt(XDRInput.java:62)
at org.jpmml.rexp.RExpParser.readInt(RExpParser.java:481)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:67)
at org.jpmml.rexp.RExpParser.readPairList(RExpParser.java:155)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:74)
at org.jpmml.rexp.RExpParser.readFunctionCall(RExpParser.java:218)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:82)
at org.jpmml.rexp.RExpParser.readPairList(RExpParser.java:155)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:74)
at org.jpmml.rexp.RExpParser.readFunctionCall(RExpParser.java:218)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:82)
at org.jpmml.rexp.RExpParser.readPairList(RExpParser.java:155)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:74)
at org.jpmml.rexp.RExpParser.readFunctionCall(RExpParser.java:218)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:82)
at org.jpmml.rexp.RExpParser.readPairList(RExpParser.java:155)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:74)
at org.jpmml.rexp.RExpParser.readFunctionCall(RExpParser.java:218)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:82)
at org.jpmml.rexp.RExpParser.readPairList(RExpParser.java:155)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:74)
at org.jpmml.rexp.RExpParser.readFunctionCall(RExpParser.java:218)
at org.jpmml.rexp.RExpParser.readRExp(RExpParser.java:82)
from jpmml-r.
Very interesting - the RDS parser component appears to go into an infinite loop.
A reproducible example would be much appreciated. Can you share your LibSVMAnomalyFormulaReq.rds RDS file, which is very nicely broken?
In your R script, can you temporarily work around this issue by escaping variable names? For example, try surrounding them with backticks as suggested here:
https://stackoverflow.com/questions/3574385/can-i-escape-characters-in-variable-names
from jpmml-r.
Ok, I will try to explain myself better.
Unfortunately I cannot send you the data (for privacy reasons).
I will try to build a toy model with the same errors.
What I can tell you is that the feature names contain 4-grams from Apache logs (so something like "GET ", "ET /", "T /g" and so on...).
I'm trying to do anomaly detection on the requests so I'm building a One-Class SVM (both in R and Python).
When I use Python there are no problems with the variable names, while in R I had to use the following trick: I changed all the variable names to "X1X", "X2X", "X3X" and so on. This fixed the problem, and the jpmml-r package correctly performed the rds --> pmml conversion. Then I changed the variable names back in the PMML file, taking into account that "&" --> "&amp;". This created the correct model, and the results agreed with the Python one.
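The renaming trick described above could be sketched roughly as follows. This is only an illustration, not the reporter's actual script: the helper names `make_placeholders` and `restore_names` are hypothetical, the sample PMML fragment is made up, and the sketch assumes that field names appear in the PMML file only as double-quoted attribute values.

```python
from xml.sax.saxutils import quoteattr


def make_placeholders(feature_names):
    """Map each original feature name to a safe placeholder such as X1X."""
    return {name: f"X{i}X" for i, name in enumerate(feature_names, start=1)}


def restore_names(pmml_text, placeholders):
    """Swap quoted placeholders in PMML text for XML-escaped original names."""
    for original, placeholder in placeholders.items():
        # Matching the surrounding quotes keeps "X1X" from matching "X11X".
        # quoteattr() escapes &, < and > and adds the attribute quotes.
        pmml_text = pmml_text.replace(f'"{placeholder}"', quoteattr(original))
    return pmml_text


# Hypothetical feature names and PMML fragment, for illustration only.
names = ["GET ", "a&b="]
mapping = make_placeholders(names)
pmml = '<DataField name="X1X"/><DataField name="X2X"/>'
restored = restore_names(pmml, mapping)
```

Plain text replacement is used here only because it is short. Parsing the PMML with an XML library and rewriting the attributes would be more robust for real files.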
Here I have another question: I'm trying to use the generated PMML files inside a Scala program. While the results from R and Python agree (as I said before), the results from the Scala One-Class SVM model are quite different. Do you have any ideas about this? Could this be an issue with Scala (I'm thinking about machine precision) or something with the One-Class SVM (and libsvm)?
Thanks for your time.
Best,
Simon
from jpmml-r.
The PMML standard (and the JPMML implementation of it) does not have a concept of reserved symbols/keywords. For example, the string & would be a perfectly acceptable field name. There is no need to escape it as \& or &amp; - honey badger don't care.
The problem is specific to the R platform, because R has the concept of reserved symbols/keywords. The problem would probably be resolved by escaping variable names properly - did you try using backticks as suggested above? It is no wonder that the RDS parser gets confused when the RDS model file contains incorrect RDS strings. Sure, it would be nice if the RDS parser were able to detect and recover from such a situation, but you as an R end user can prevent this situation from happening in the first place.
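To illustrate the point about field names at the XML level, here is a minimal Python sketch. It assumes only the standard library's `xml.etree.ElementTree`, and it uses a hypothetical `DataField` element with the made-up name "T&/g". An XML writer emits the ampersand as an entity in the file, and a parser turns it back into a plain ampersand, so the field name itself never needs manual escaping.

```python
import xml.etree.ElementTree as ET

# Hypothetical field name containing an ampersand, for illustration only.
field_name = "T&/g"

# Build an element; the attribute value is given unescaped.
element = ET.Element("DataField", {"name": field_name})

# Serialization escapes the ampersand as an entity in the XML text.
xml_text = ET.tostring(element, encoding="unicode")

# Parsing the text back yields the original, unescaped field name.
round_tripped = ET.fromstring(xml_text).get("name")
```

The escaped form exists only in the serialized file. Code that reads the model through an XML parser sees the plain name.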
from jpmml-r.