Comments (1)
@krzyslom I think FSelectorRcpp completely removes rows with NAs. Can you provide a summary of FSelector's behaviour in this case?
From FSelector:::information.gain.body -> FSelector:::discretize.all -> FSelector:::supervised.discretization I see:
function (formula, data)
{
    data = get.data.frame.from.formula(formula, data)
    complete = complete.cases(data[[1]])
    all.complete = all(complete)
    if (!all.complete) {
        new_data = data[complete, , drop = FALSE]
        result = Discretize(formula, data = new_data, na.action = na.pass)
        return(result)
    }
    else {
        return(Discretize(formula, data = data, na.action = na.pass))
    }
}
<environment: namespace:FSelector>
So FSelector removes only the rows where the dependent variable is NA; rows with NAs in the explanatory variables are passed on to Discretize unchanged.
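A minimal base-R sketch of that filtering step, to make the difference concrete. The data frame and its column names (`y`, `x1`, `x2`) are made up for illustration, and the response is assumed to sit in the first column, as the function above implies:

```r
# Toy data: the response is the first column (an assumption mirroring
# what get.data.frame.from.formula appears to produce).
d <- data.frame(
  y  = factor(c("a", NA, "b", "a")),
  x1 = c(1.5, 2.0, NA, 4.1),
  x2 = c(NA, 0.3, 0.7, 0.9)
)

# Same test as in supervised.discretization: only the first column is checked.
complete <- complete.cases(d[[1]])
kept <- d[complete, , drop = FALSE]

# For comparison: dropping every row that has an NA in any column.
kept_all <- d[complete.cases(d), , drop = FALSE]

nrow(kept)      # 3 rows: only the row with a missing response is dropped
nrow(kept_all)  # 1 row: only the fully observed row survives
```

Note that `kept` still contains NAs in `x1` and `x2`, which is exactly the input whose handling remains to be checked.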
So the only thing left is to check how FSelector (through its interface to `RWeka::Discretize`) deals with NAs in the explanatory variables.
> RWeka::Discretize
An R interface to Weka class 'weka.filters.supervised.attribute.Discretize', which has
information
An instance filter that discretizes a range of numeric attributes in the dataset
into nominal attributes. Discretization is by Fayyad & Irani's MDL method (the
default).
For more information, see:
Usama M. Fayyad, Keki B. Irani: Multi-interval discretization of continuousvalued
attributes for classification learning. In: Thirteenth International Joint
Conference on Articial Intelligence, 1022-1027, 1993.
Igor Kononenko: On Biases in Estimating Multi-Valued Attributes. In: 14th
International Joint Conference on Articial Intelligence, 1034-1040, 1995.
BibTeX:
@INPROCEEDINGS{Fayyad1993,
publisher = {Morgan Kaufmann Publishers},
year = {1993},
pages = {1022-1027},
author = {Usama M. Fayyad and Keki B. Irani},
title = {Multi-interval discretization of continuousvalued attributes for
classification learning},
volume = {2},
booktitle = {Thirteenth International Joint Conference on Articial Intelligence},
}
@INPROCEEDINGS{Kononenko1995,
year = {1995},
pages = {1034-1040},
PS = {http://ai.fri.uni-lj.si/papers/kononenko95-ijcai.ps.gz},
author = {Igor Kononenko},
title = {On Biases in Estimating Multi-Valued Attributes},
booktitle = {14th International Joint Conference on Articial Intelligence},
}
Argument list:
x(formula, data, subset, na.action, control = NULL)
Returns objects inheriting from classes:
Discretize data.frame
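One way to check this empirically is sketched below. It is not a statement of what RWeka does: it only sets up an input with NAs in one explanatory variable and prints what comes back. The use of `iris` and the choice of rows are arbitrary, and the call is skipped if RWeka (and its Java dependency) cannot be loaded:

```r
# Put a few NAs into one explanatory variable; the response stays complete.
d <- iris
d$Sepal.Length[c(1, 60, 120)] <- NA

if (requireNamespace("RWeka", quietly = TRUE)) {
  # Same call shape as in FSelector:::supervised.discretization.
  res <- RWeka::Discretize(Species ~ ., data = d, na.action = na.pass)

  # Inspect whether rows were dropped and what happened to the NA entries.
  print(nrow(res) == nrow(d))
  print(summary(res$Sepal.Length))
} else {
  message("RWeka is not available; skipping the check.")
}
```

Comparing the row count and the summary of the discretized column against the input should show whether the NAs are kept, dropped, or assigned to a bin.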