Comments (24)
@krzyslom can you look at this? We can start by porting this as is, and then gradually replace parts with C++ code.
from fselectorrcpp.
@zzawadz I'll investigate it next weekend.
I promise I'll look at it. I think I'll start by copying the whole function from FSelector and then replace its parts with C++ code.
@pat-s it's on CRAN now.
any update here? :)
Hey @pat-s
I took a look at the function a while ago. Here are my thoughts.
- Controversial silent behavior e.g.
if (sample.size < 1) {
warning(paste("Assumed: sample.size = ", sample.size))
sample.size = 1
sample_instances_idx = sample(1:instances_count, 1)
}
The function prints the passed value while calling it assumed, and then silently changes it. This behavior occurs in several steps. Moreover, we can pass a double or a negative number to sample.size, which does not make sense. Perhaps it is better to throw an error in such cases.
- There are several non-exported functions that would also have to be implemented in the package, e.g. get.data.frame.from.formula and normalize.min.max.
- The relief function is composed of several inner functions. The results of their calls are assigned to variables in an enclosing scope with the <<- operator. The question is whether we should just copy-paste this functionality, or try to redesign the function with a future C++ interface in mind. Unfortunately I am not familiar with C++, so I cannot judge that well. I'm not sure if @zzawadz would write the Rcpp implementation in a similar fashion.
@zzawadz please take a look at the topic.
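The suggestion to throw an error instead of silently replacing sample.size could look roughly like the sketch below. It is plain C++ rather than Rcpp, and checked_sample_size is a hypothetical helper name, not something that exists in FSelector or FSelectorRcpp. It only covers the two cases mentioned above (values below 1 and non-integer values).

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of FSelector or FSelectorRcpp):
// validates a user-supplied sample size and rejects bad input
// instead of warning and silently replacing it with 1.
std::size_t checked_sample_size(double sample_size) {
    // Rejects NaN, infinities, zero and negative values.
    if (!std::isfinite(sample_size) || sample_size < 1.0) {
        throw std::invalid_argument(
            "sample_size must be a finite number >= 1, got " +
            std::to_string(sample_size));
    }
    // Rejects fractional values such as 2.5.
    if (std::floor(sample_size) != sample_size) {
        throw std::invalid_argument(
            "sample_size must be a whole number, got " +
            std::to_string(sample_size));
    }
    return static_cast<std::size_t>(sample_size);
}
```

In an Rcpp port the same checks would presumably be reported through R's own error mechanism, but that part is left out here.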
Sorry for the late follow-up.
I have no detailed knowledge of the algorithm, but we would like to use an Rcpp version of it in both {mlr} and {mlr3filters}.
I just searched again and there is currently no other package implementing the RELIEF algorithm in R.
Compared to other filters, the Java-based {FSelector} implementation takes a really long time. It would be a good enhancement if we could port this to {FSelectorRcpp}. This filter method is still able to achieve good results in benchmarks, and I really think the effort would be worth it.
Sometimes a rewrite is even easier than porting existing code, but you are the experts here. I have no clue about the internals and know very little about Rcpp.
@zzawadz Did you have time to look at all of this in the meantime?
CC @zzawadz
bump :)
My plan is like this:
- Contact the author of this function and get approval for adapting his code? (Do I need this? It's open source, so maybe just mentioning the author will be sufficient. I have spent too much time in BigCo to be sure.)
- Adapt the code to use more FSelectorRcpp-like interfaces.
- Remove the <<- assignment (it's hard for me to reason about the code where it's used).
- Rewrite the hottest parts in cpp?
@pat-s what do you think?
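To make the last two points of the plan concrete, here is a minimal sketch of what a C++ version of the hot loop might look like. It implements the basic Relief weight update (nearest hit and nearest miss per instance), not the FSelector code itself. It assumes numeric features already scaled to [0, 1] and integer class labels, and it visits every instance once instead of sampling so the result is deterministic. The names relief_weights and manhattan are made up for this illustration. All state lives in local variables and is returned, so nothing has to be written to an outer scope the way <<- does.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Row = std::vector<double>;

// Manhattan distance between two instances.
// Assumption: features are already scaled to [0, 1].
double manhattan(const Row& a, const Row& b) {
    double d = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) d += std::fabs(a[j] - b[j]);
    return d;
}

// Illustrative basic Relief (hypothetical name, not the FSelector code).
// For each instance, finds the nearest hit (same class) and the nearest
// miss (other class), then rewards features that differ on the miss and
// penalizes features that differ on the hit.
std::vector<double> relief_weights(const std::vector<Row>& x,
                                   const std::vector<int>& y) {
    const std::size_t n = x.size();
    const std::size_t p = n ? x[0].size() : 0;
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> w(p, 0.0);  // local state, returned at the end

    for (std::size_t i = 0; i < n; ++i) {
        std::size_t hit = n, miss = n;  // n means "not found"
        double best_hit = inf, best_miss = inf;
        for (std::size_t k = 0; k < n; ++k) {
            if (k == i) continue;
            const double d = manhattan(x[i], x[k]);
            if (y[k] == y[i]) {
                if (d < best_hit) { best_hit = d; hit = k; }
            } else if (d < best_miss) {
                best_miss = d;
                miss = k;
            }
        }
        if (hit == n || miss == n) continue;  // no usable neighbour pair
        for (std::size_t j = 0; j < p; ++j) {
            w[j] += (std::fabs(x[i][j] - x[miss][j]) -
                     std::fabs(x[i][j] - x[hit][j])) /
                    static_cast<double>(n);
        }
    }
    return w;
}
```

On a toy data set where the first feature separates the two classes and the second is constant, the first feature gets a positive weight and the second stays at zero. The neighbour search is quadratic in the number of instances, which is exactly the part that benefits from being compiled.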
Sure, whatever works :) You are the C++ expert here 👍
@MarcinKosinski @krzyslom can you look at the relief branch? It has the current version. I think the API will stay as is (the same approach we're using in information_gain), so we can release it and then work on speeding up the internals of relief.
Any more tests will be welcomed ;)
Ok then. I'll schedule the release for Monday. @MarcinKosinski any uncaught errors will be your fault xD
Hey, just created this PR https://github.com/mi2-warsaw/FSelectorRcpp/pull/86/files to check which files were changed.
I extended NEWS and added parameter verification. I'm fine with merging so that the pkg has the relief algorithm copied from FSelector. However, I imagine the request here is to get the algorithm in C++, which is beyond my skills.
@MarcinKosinski I'll be porting this into cpp gradually;)
Thanks guys!! 🎉
Finally being able to avoid {FSelector}.
@zzawadz Do you have a rough ETA when this goes to CRAN?
Maybe @MarcinKosinski knows?