Comments (9)
May I add some things?!
I have seen some double comparisons with ==, which is not numerically stable, e.g.
if(a == b && a == x)
in distributions.Uniform:26 (and in many other lines).
I would change every occurrence in the code, but usually I use apache-commons-math.Precision.
Is it your goal to keep JSAT without any external dependencies?
References:
[https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/]
[https://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/util/Precision.html#equals%28double,%20double,%20int%29]
[http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html#689]
In addition, I already corrected some java docs. You can see it in my dev branch. Hell, you wrote a lot of code. How long have you been developing JSAT?
from jsat.
May I add some things?!
Of course!
in distributions.Uniform:26 (and in many other lines).
In that case it is actually safe. if a == b && a == x is the case where the uniform distribution is over one infinitesimal point, which really doesn't make sense, but is needed because otherwise the (a <= x && x <= b) case would apply, and result in a NaN from a division by zero.
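To illustrate the point about the degenerate case, here is a minimal sketch. It is not the actual code from Uniform.java: the class name, the method names, the choice of the CDF as the affected function, and the value returned for the single-point case are all assumptions made for the example.

```java
// Hypothetical sketch, not JSAT's Uniform implementation.
class UniformCdfSketch {

    // Without the exact-equality guard: for a == b == x the range check
    // passes and the expression becomes 0.0 / 0.0, which is NaN.
    static double unguardedCdf(double x, double a, double b) {
        if (a <= x && x <= b) {
            return (x - a) / (b - a);
        }
        return x < a ? 0.0 : 1.0;
    }

    // With the guard: the exact == check is intentional here, because it is
    // the only situation in which the division below divides zero by zero.
    static double guardedCdf(double x, double a, double b) {
        if (a == b && a == x) {
            return 1.0; // assumed value for the single-point distribution
        }
        if (a <= x && x <= b) {
            return (x - a) / (b - a);
        }
        return x < a ? 0.0 : 1.0;
    }

    public static void main(String[] args) {
        System.out.println(unguardedCdf(1.0, 1.0, 1.0)); // NaN
        System.out.println(guardedCdf(1.0, 1.0, 1.0));   // 1.0
        System.out.println(guardedCdf(0.5, 0.0, 1.0));   // 0.5
    }
}
```

A tolerance-based comparison would not help in this spot, since any a and b that differ even slightly give a finite, well-defined quotient.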
I'm generally aware of the dangers of floating point, but there are instances where such == checks are done intentionally, such as the case in the Uniform distribution. There are a few other instances of code smells in JSAT that are done intentionally; hopefully I did a better job documenting them than I did that one in Uniform.java
I already corrected some java docs.
Feel free to do a pull request on them!
Hell, you wrote a lot of code. How long have you been developing JSAT?
Been a free time project for just over 4 years now :) I'm very much a "learn by doing" kinda guy, so whenever I see a paper I'm interested in I just go and try and implement it when I have the time. Though I did have more free time before working for money :-/
from jsat.
Feel free to do a pull request on them!
I will try to do so, but I already added some things which I would say are necessary (apache-commons-math for safe double comparison; plan to add more, e.g. trove for primitive lists).
My plan is to create a Maven repo that contains my JSAT version so I also changed the pom.xml and will even add more stuff.
I will try to cherry pick the safe changes without adding dependencies etc.
I ran findbugs on the project and it found some bad double compares and x == Double.NaN checks.
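For context on the x == Double.NaN finding, a short sketch of why that pattern never matches. The class and method names are made up for the example; Double.isNaN is the standard JDK method and needs no external dependency.

```java
// Hypothetical helper class for illustration only.
class NaNCheckSketch {

    // NaN compares unequal to every value, including itself,
    // so this always returns false.
    static boolean brokenIsNaN(double x) {
        return x == Double.NaN;
    }

    // The JDK check that actually detects NaN.
    static boolean isNaN(double x) {
        return Double.isNaN(x);
    }

    public static void main(String[] args) {
        double nan = 0.0 / 0.0;
        System.out.println(brokenIsNaN(nan)); // false
        System.out.println(isNaN(nan));       // true
    }
}
```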
What do you think about including some de facto standard libraries for such hassles?
from jsat.
As I said, some of those checks are done intentionally.
Right now I do not want to have any dependencies in JSAT.
Also, FYI - there is some code where you will experience a serious performance regression in using Trove instead of the code I have for primitive maps. (And some code that is on my local repo and not committed yet).
from jsat.
As I said, some of those checks are done intentionally.
I think I found some places where the checks should not be done but I will write a testcase first which may take some time. Will let you know when it's done.
Do you have any advice on how to compare two doubles for equality in JSAT without an external library? I have an array of doubles and want to check whether they are all the same.
Right now I do not want to have any dependencies in JSAT.
Ok :(
Also, FYI - there is some code where you will experience a serious performance regression in using Trove instead of the code I have for primitive maps. (And some code that is on my local repo and not committed yet).
I would use them e.g. instead of List to prevent a lot of autoboxing (I had an algorithm where removing boxing/unboxing sped it up by a factor of 2).
from jsat.
Feel free to email me the places where you have concerns and I will review them
You need to decide what level of "sameness" you need. In ML we usually don't care about differences in value smaller than 1e-3, so doing abs(a-b) < 1e-3 should be good for most cases. In the Vec class there is a compare to method that takes a threshold on the absolute difference.
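A dependency-free sketch of that absolute-difference check, extended to the "array of doubles" question. The class and method names are hypothetical and this is not the Vec method mentioned above. One design choice here is an assumption: the array check compares the spread (max minus min) against the threshold, so the result does not depend on which element is used as the reference.

```java
// Hypothetical helper class for illustration only.
class ToleranceSketch {

    // True if a and b differ by less than tol in absolute terms.
    static boolean nearlyEqual(double a, double b, double tol) {
        return Math.abs(a - b) < tol;
    }

    // True if all values lie within tol of each other. NaN entries
    // make the check fail; an empty array counts as "all the same".
    static boolean allNearlyEqual(double[] values, double tol) {
        if (values.length == 0) {
            return true;
        }
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            if (Double.isNaN(v)) {
                return false;
            }
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return max - min < tol;
    }

    public static void main(String[] args) {
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-3));                          // true
        System.out.println(allNearlyEqual(new double[] {1.0, 1.0004, 0.9998}, 1e-3));   // true
        System.out.println(allNearlyEqual(new double[] {1.0, 1.002}, 1e-3));            // false
    }
}
```

An absolute threshold like 1e-3 only makes sense when the values are on a known scale; for very large or very small magnitudes a relative tolerance would be the usual alternative.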
As I said, I have my own primitive collections in JSAT, I did not say to use generics.
from jsat.
As I said, I have my own primitive collections in JSAT, I did not say to use generics.
Ahh, found them by now. Is there a reason why you use List/List then? E.g. NaiveBayes.getSampleVariableVector (l. 388)
from jsat.
That code was written before I created the DoubleList class, and I missed changing that one.
from jsat.
I would separate abstract classes/interfaces from concrete implementations by putting them into sub-packages.
from jsat.