Giter VIP home page Giter VIP logo

Comments (24)

zzawadz avatar zzawadz commented on August 16, 2024 2

@krzyslom can you look at this? We can start by porting this as is, and then gradually replacing parts with the C++ code.

from fselectorrcpp.

krzyslom avatar krzyslom commented on August 16, 2024 2

@zzawadz I'll investigate it next weekend.

from fselectorrcpp.

zzawadz avatar zzawadz commented on August 16, 2024 1

I promise I'll loot at it. I think that I'll start from copying the whole function from FSelector and start replacing its parts with C++ code.

from fselectorrcpp.

zzawadz avatar zzawadz commented on August 16, 2024 1

@pat-s it's on CRAN now.

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

any update here? :)

from fselectorrcpp.

krzyslom avatar krzyslom commented on August 16, 2024

Hey @pat-s

I took a look at the function a while ago. Here are my thoughts.

  1. Controversial silent behavior e.g.
if (sample.size < 1) {
 warning(paste("Assumed: sample.size = ", sample.size))
 sample.size = 1
 sample_instances_idx = sample(1:instances_count, 1)
}

The function prints passed value while calling it assumed. Next the value is silently changed. This behavior occurs in several steps. Moreover we can pass double or a negative number to sample.size which does not make sense. Perhaps it is better to throw an error is such cases.

  1. There are several not exported functions that also have to be implemented in the package e.g. get.data.frame.from.formula and normalize.min.max.

  2. relief function is composed from several inner functions. Results of their call are assigned to global variables with <<- operator. The question is should we just try to make a copy-paste for this functionality, or should we try to redesign the function with future C++ interface in mind. Unfortunately I am not familiar with C++ so I do not have a good judgement in that case. I'm not sure if @zzawadz would write Rcpp implementation in similar fashion.

@zzawadz please take a look at the topic.

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

Sorry for the late follow-up.

I have no detailed knowledge about the algorithm but we would like to use a Rcpp version of it in both {mlr} and {mlr3filters}.

I just searched again and there is currently no other package implementing the RELIEF algorithm in R.

Compared to other filters, the java-based {FSelector} implementation takes really long. It would be a good enhancement if we could port this to {FSelectorRcpp}. This filter method is still able to achieve good results in benchmarks and I really think the effort would be worth it.

Sometimes a rewrite is even easier than porting existing code - but you are the experts here, I have no clue about the internals and know only very few about Rcpp.

@zzawadz Did you have time to look at all of this in the meantime?

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

CC @zzawadz

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

bump :)

from fselectorrcpp.

zzawadz avatar zzawadz commented on August 16, 2024

My plan is like this:

  1. contact author of this function and get approval for adapting his code? (do i need this? it's open source so maybe just mentioning the author will be sufficient - I have spent too much time in BigCo to be sure).
  2. Adapt the code to use more FSelectorRcpp's like interfaces.
  3. Remove <<- assignment (It's hard for me to reason about this code where it's used).
  4. Rewrite hottest parts in cpp?

@pat-s what do you think?

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

Sure, whatever works :) You are th C++ expert here 👍

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

from fselectorrcpp.

zzawadz avatar zzawadz commented on August 16, 2024

@MarcinKosinski @krzyslom can you look at the relief branch? There's the current version. I think the api will stay as is (same approach as we're using in information_gain), so we can release it and then work on the speeding up the internals of relief.

Any more tests will be welcomed ;)

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

from fselectorrcpp.

zzawadz avatar zzawadz commented on August 16, 2024

Ok then. I'll schedule the release to Monday. @MarcinKosinski any uncaught errors will be your fault xD

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

Hey, just created this PR https://github.com/mi2-warsaw/FSelectorRcpp/pull/86/files to check which files were changed.
I extended NEWS and added parameters verification. I'm good to merge to have the relief algorithm in the pkg copied from FSelector however I imagine the request here is to get the algorithm in C++ which is beyond my skills

from fselectorrcpp.

zzawadz avatar zzawadz commented on August 16, 2024

@MarcinKosinski I'll be porting this into cpp gradually;)

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

Thanks guys!! 🎉
Finally being able to avoid {FSelector}.

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

@zzawadz Do you have a rough ETA when this goes to CRAN?

from fselectorrcpp.

pat-s avatar pat-s commented on August 16, 2024

Maybe @MarcinKosinski knows?

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

from fselectorrcpp.

MarcinKosinski avatar MarcinKosinski commented on August 16, 2024

from fselectorrcpp.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.