Giter VIP home page Giter VIP logo

Comments (30)

abelsiqueira avatar abelsiqueira commented on June 2, 2024 1

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024
  1. Title of the package: Why "Algebraic"? It uses numerical methods and not symbolic differentiation.

    The term "algebraic" refers to the type of problem solved and not to the programming techniques of the algorithm. Although it is not a common term in adjustment functions, it serves to differ from the geometric fit (geometrical fit functions). In the case of geometric fit, we are interested in minimizing the sum of deviations of the type min_x || (x, phi (t, x)) - (t_i, y_i) || (ie, orthogonal projection on the curve to be adjusted). In the algebraic case, we are interested in minimizing the sum of deviations of type || phi (t_i, x) - y_i ||. We chose this term to make the package's name more readable. This information has been added in the Introduction.

  2. First line: use \mathbb{R} instead of \mathcal{R}

  3. Prefer Package.jl to Package to designate it

  4. Second paragraph of "Summary". "adjustment of mathematical functions to data" is also referred to as regression, which is not mentioned in the paper, while it is fundamentally the purpose of the package.

    We have added the term regression in the manuscript.

  5. "Detection of outliers is always regarded ...". It's hard to see what the authors mean here. Without proper references backing up the statement, this seems rather vague.

    We have added to the second paragraph more information about techniques to deal with outlier detection. Also, we have described the impact of outliers when regression is performed by the sum-of-squares technique.

  6. "Recently, a good review", the use of "good" is subjective (what is a good review, in what context). A review of what?

    TODO: @evcastelani

  7. In statistics, it is common to use f(x; θ) for a function of x with parameters theta, which one tries to fit in a regression task.

    We have changed the whole package to adopt this statistical way of using. Now x refers to the model's variables and θ to its parameters, which we actually want to find.

  8. In paragraph "Functionality", the equation is not homogeneous with the rest:
    ϕ(x,t) instead of ϕ(x;t), prefer ";" to indicate the separation of data and parameters.

    We did not use ; in the code, as ; is used for optional parameters in Julia. We have used \phi(x, \theta) in the whole manuscript, so the model can be written in the same way in Julia.

  9. There could be a section on the method itself, or at least more development, including on the LOVO algorithm. For instance, the method is not black-box optimization but assumes the existence of gradient, thus making it first-order. Furthermore, if gradients are not provided by users, the use of ForwardDiff makes it mandatory to have the function defined in a generic way (accepting any Real number).
    Thus, how is the proposed method different from using other first-order methods (such as the ones in Optim.jl or JuliaSmoothOptimizers)? This should be detailed.

    1. New section defining the LOVO problem.
    2. We tried to make it clear the difference between derivatives of the model and the derivatives of the LOVO function
    3. For solving LOVO problems, there are first-order or second-order algorithms [ref Yano]. RAFF.jl implements a Levenberg-Marquardt algorithm (which is first-order) in the context of LOVO problems. This implementation is a novel algorithm. LOVO problems need the number of possible outliers to be given by the user. RAFF.jl solves this limitation by implementing a voting system. A voting system is a brute force algorithm, where several LOVO problems are solved with different numbers of possible outliers. The solution which most occurs among them is declared as the solution by RAFF.jl.

    This discussion has been added in the end of the Background section.

  10. "it is obvious that its arguments [...]", avoid the use of "obvious" which is subjective, and develop what is meant here.

    This phrase has been removed, since we replaced the example by a more intuitive one.

  11. "Additional features" paragraph, sentence: "the distributed version is a centralized ..." is a contradiction and should be re-phrased.

    Fixed. The distributed version is a master-slave implementation.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

@evcastelani thanks for the answer list. For the algebraic part, putting this explanation and/or references in the paper would be useful

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

ping me when the editions are done

from raff.jl.

abelsiqueira avatar abelsiqueira commented on June 2, 2024

Review

The authors implement a solver for data fitting that automatically detects outliers, based on algorithms using Lower Order-Value Optimization. The package meets the criteria for publication on JOSS, but there are a few items that should be addressed, and a few suggestions.

  • What kind of optimal fitting is achieved? Seems to be least squares (for the data without the outliers), but it's worth mentioning in the summary;
  • In what sense are the outliers defined? Some quantile?
  • Since the problem in the example is nonlinear, it should be indicated which solver was used to solve the problem, and which starting point;
  • I think that a comparison with a different fitting not so sensitive to outliers would improve the quality of the paper. Perhaps fitting with norm-1 or Huber loss.
  • The releases are not being registered. The Julia register has 0.4.3, but there is a release 0.6.0 already and the docs are being updated according to the releases;
  • In the first paragraph of Additional features: "...provided, what results in...", change "what" to "which".
  • On Figure 1, move the legend to the right, use a stronger color for the data points that are not outliers, and make it black-and-white/color-blind friendly by using a different marker style for the outliers and changing line-styles (I'd remove the "True solution").

Suggestions: These can be disregarded it the authors prefer:

  • In the context of regression, I find that it's clearer to use x and y for the data and \beta or \theta for the parameters being optimized. Our traditional least squares problem min |Ax - b|^2 is usually written as min |Xβ - y|^2.
  • I find that entering the data separately for x and y is better, e.g.
x = range(-1, 1, length=100)
y = 2 .+ 3x + randn(100) * 0.1
model(β, x) = β[1] + β[2] * x[1]
output = raff(model, x, y, 2) # Here

from raff.jl.

fsobral avatar fsobral commented on June 2, 2024

@abelsiqueira , Thank you for your comments. Here are the answers for your suggestions. Most of them have been included in the new version of the manuscript.

  • What kind of optimal fitting is achieved? Seems to be least squares (for the data without the outliers), but it's worth mentioning in the summary;

    We have added an explanation regarding the kind of optimal fit that RAFF.jl finds in the Introduction. It is in the same sense as the traditional nonlinear regression or least squares.

  • In what sense are the outliers defined? Some quantile?

    Outliers are defined as the points for which the deviation to the model are the highest. RAFF.jl uses a voting system to try to detect the number of points that can be considered as outliers. To do this, it uses the ideas behind LOVO problems. A more formal description of those problems has been added in the Background section.

  • Since the problem in the example is nonlinear, it should be indicated which solver was used to solve the problem, and which starting point;

    We have replaced the example, since it was causing confusion about the problem that is solved. We hope that the section Background and the new example will make it clear the type of problem that we solve. We have also highlighted the fact that we implement a novel algorithm for LOVO problems, based on the classical Levenberg-Marquardt algorithm. LOVO problems cannot be solved by any nonlinear algorithm and we expected to achieve second-order convergence with this approach.

  • I think that a comparison with a different fitting not so sensitive to outliers would improve the quality of the paper. Perhaps fitting with norm-1 or Huber loss.

    We do not think that the 1-norm would be more robust than the least squares (suppose that the errors are smaller than 1, for example). In addition, the problem would become non-differentiable and harder to solve. Moreover, one can always create a sufficiently far outlier so that the "more robust" approaches also will construct poor models. Our new example was intended to show how could one benefit for detecting and eliminating outliers, which is different from simply tolerating them. Although it is easy to see the two outliers, when the dimension of the problem increases it becomes impossible to visually detect and remove them.

  • The releases are not being registered. The Julia register has 0.4.3, but there is a release 0.6.0 already and the docs are being updated according to the releases;

    We have submitted the two new versions of RAFF.jl to registration.

  • In the first paragraph of Additional features: "...provided, what results in...", change "what" to "which".

  • On Figure 1, move the legend to the right, use a stronger color for the data points that are not outliers, and make it black-and-white/color-blind friendly by using a different marker style for the outliers and changing line-styles (I'd remove the "True solution").

    The figure was replaced by a better example. We kept the "True solution" so readers can see how close to the solution is the parameters found by RAFF.

Suggestions: These can be disregarded it the authors prefer:

  • In the context of regression, I find that it's clearer to use x and y for the data and \beta or \theta for the parameters being optimized. Our traditional least squares problem min |Ax - b|^2 is usually written as min |Xβ - y|^2.

  • I find that entering the data separately for x and y is better, e.g.

    x = range(-1, 1, length=100)
    y = 2 .+ 3x + randn(100) * 0.1 
    model(β, x) = β[1] + β[2] * x[1]
    output = raff(model, x, y, 2) # Here
    

    The idea is great. It will be added to project in a future release. See #19 .

from raff.jl.

fsobral avatar fsobral commented on June 2, 2024

Dear @abelsiqueira and @matbesancon , Thank you for your valuable comments and suggestions. We have added responses to most of your comments and questions. If there is no comment, it is because it has been fully accepted.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

Thank you I should have time to review the changes between now and monday

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

Second review

Notation

Page 2: "we can sort the set in ascending order"
The notation below is F_{i(theta)}(theta) while in the rest of the paper, it is F_{i}(theta), I didn't see why.
For the notation again, it is counter-intuitive that the functions get re-ordered for every theta value. Because of this, it is hard to see how S_p(theta) is not differentiable (because p is still fixed, but the re-ordering of the indices switches which function is where).

Method

"RAFF.jl" solves this limitation by implementing a voting system. It is not clear from the end of last paragraph which limitation the authors are referring to.

How is the voting system brute-force?

"The solution which most occurs"

Solutions are not countable (two solutions are equal iff the functions yield the same image at any in any point in the domain, does that even happen with varying points?)

Phrasing

i.e., it solves the problem ... "i.e." is strangely spaced.

One of the most usual ways to solve the problem of least-squares

It would be good to emphasize "problem of non-linear least-squares", otherwise pseudo-matrix inversion would be preferred.

"master-slave implementation" --> prefer primary-worker / primary-secondary or other terms. master-slave now tends to be avoided in technical contexts.

from raff.jl.

fsobral avatar fsobral commented on June 2, 2024

Second review

Notation

Page 2: "we can sort the set in ascending order"
The notation below is F_{i(theta)}(theta) while in the rest of the paper, it is F_{i}(theta), I didn't see why.
For the notation again, it is counter-intuitive that the functions get re-ordered for every theta value. Because of this, it is hard to see how S_p(theta) is not differentiable (because p is still fixed, but the re-ordering of the indices switches which function is where).

The LOVO function is not very easy to understand. One has to spend some time getting used to this terminology and notation. The easiest way of seeing the non differentiability of S_p(theta) is by observing that f_min(theta) = S_p(theta) for all theta, as discussed in the paragraph starting at line 15 on pg. 2. Also, f_min highlights the combinatorial spirit of the LOVO problem.

Regarding the notation, it is correct. When we write F_{i_1(theta)}(theta) we mean that it is the smallest value in the set {F_{i}(theta), i = 1, ..., m} for a given theta. F_{i_2(theta)}(theta) is the second smallest value in the set, and so on. The procedure for obtaining this is by evaluating all F_i at theta and ordering the set {F_{i}(theta), i = 1, ..., m}. That's how the indices i_{k}(theta) are obtained. Since the values of F_i change for different values of theta, the order of the F_i's change too.

Therefore, we cannot change the notation, but we will add a small example to explain it better. We will ping you when it is finished.

Method

"RAFF.jl" solves this limitation by implementing a voting system. It is not clear from the end of last paragraph which limitation the authors are referring to.

How is the voting system brute-force?

"The solution which most occurs"

Solutions are not countable (two solutions are equal iff the functions yield the same image at any in any point in the domain, does that even happen with varying points?)

We have rephrase this paragraph to:

In this voting system, several LOVO subproblems are solved
with different values for $p$, the number of possible trusted
points. Each solution of a LOVO subproblem is associated to a vector
parameter $\theta$. The vector parameters are compared against each
other using the Euclidian distance, where small distances (using a
threshold) are considered the same solution. The parameter $\theta^*$
which most occurs among them is declared as the solution.

Phrasing

i.e., it solves the problem ... "i.e." is strangely spaced.

One of the most usual ways to solve the problem of least-squares

It would be good to emphasize "problem of non-linear least-squares", otherwise pseudo-matrix inversion would be preferred.

"master-slave implementation" --> prefer primary-worker / primary-secondary or other terms. master-slave now tends to be avoided in technical contexts.

Done. Thanks for the observation

from raff.jl.

abelsiqueira avatar abelsiqueira commented on June 2, 2024
  • At the end Functionality, mention that two points were manually set to be outliers.

I'm otherwise satisfied with the updates and the new version.

from raff.jl.

fsobral avatar fsobral commented on June 2, 2024
* At the end Functionality, mention that two points were manually set to be outliers.

I'm otherwise satisfied with the updates and the new version.

You are correct. I have added the comment in the same phrase describing of the red triangles.

from raff.jl.

fsobral avatar fsobral commented on June 2, 2024

@matbesancon We have added the simple example regarding the LOVO function. I hope it becomes clearer what we mean as its non differentiability.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

Thanks, it looks good. Minor last things:

[...] therefore, can be run locally or in a cluster of computers

Shouldn't it be "on a cluster of computers"?

used by all experimental researchers who know a little about ...

This sentence looks a bit informal and can be interpreted in different ways: is it for experimental researchers only if they know a little, by experimental researchers even if they know little about, etc.

from raff.jl.

fsobral avatar fsobral commented on June 2, 2024

Shouldn't it be "on a cluster of computers"?

Yes, you are right. Done.

used by all experimental researchers who know a little about ...

This sentence looks a bit informal and can be interpreted in different ways: is it for experimental researchers only if they know a little, by experimental researchers even if they know little about, etc.

The idea is that they do not need to know much about mathematical modeling, but definitely, they must know something, otherwise writing Julia functions defining models won't make any sense. I have rewritten this sentence to

This package is intended to be used by any experimental researcher
with a little knowledge about mathematical modeling and fitting
functions.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

I'm ok with the changes thanks. Last tiny things in the reference, the letters are not properly capitalizd, you should use title = {{Title}} instead of title = {Title} everywhere

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024

Hi @matbesancon! You're right. The title of the references had to be standardized. Thanks for the note. However, running locally by Pandoc, the use of {{tilte}} worked like a charm but in JOSS no. In this way, I made the manual correction of capital letters to avoid possible errors.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

Hi, no it seems to have worked, the titles are properly capitalized. I think this was the last point on my side.
@abelsiqueira should we close this issue?

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

Great, doing one last check on the last version of the software in an hour or so, then validating

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

sorry last small details in "installation and usage":
"and can be directly installed from the Julia package repository" or "and can be directly installed from Julia's package repository" but the former might be preferred.

"All descriptions" or "The whole description"

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024

@matbesancon you're right. These typos have been fixed.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

ok great, the paper seems fine, watch out builds seem to all fail

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024

@matbesancon Ok, thanks! Now we fixed the problem and both branches (master and joss) are passing the automated tests.

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

perfect, I think this is good on my side. @abelsiqueira do you still have something on the last version of the paper?

from raff.jl.

abelsiqueira avatar abelsiqueira commented on June 2, 2024

Reference Keles needs to be updated (DOI, pages and author name):

@article{Kele2018,
  doi = {10.15345/iojes.2018.03.013},
  url = {https://doi.org/10.15345/iojes.2018.03.013},
  year = {2018},
  publisher = {International Online Journal of Educational Sciences},
  volume = {10},
  number = {3},
  author = {Taliha Kele{\c{s}}},
  title = {Comparison of Classical Least Squares and Orthogonal Regression in Measurement Error Models},
  journal = {International Online Journal of Educational Sciences}
}

Found the DOI on the paper and the bibtex using https://www.doi2bib.org/bib/10.15345/iojes.2018.03.013.

That's it for me.

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024

@abelsiqueira you're right! We fixed DOI,name and pages as you suggest.

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024

@matbesancon and @abelsiqueira Can I close this issue?

from raff.jl.

matbesancon avatar matbesancon commented on June 2, 2024

from raff.jl.

evcastelani avatar evcastelani commented on June 2, 2024

@matbesancon Ok, you're right.

from raff.jl.

abelsiqueira avatar abelsiqueira commented on June 2, 2024

Yes, close it, thanks for the corrections.

from raff.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.