juleskouatchou / basic_language_comparison
License: Other
I ran the https://github.com/JulesKouatchou/basic_language_comparison/blob/master/Julia/test_munchausen.jl script on my laptop and got times between 5 s and 7 s for Julia 0.6.4, 1.0 and 1.5. Thus the 100 s on a Xeon node seems off.
Julia is good in that it has nearly identical syntax to MATLAB, which makes it great for converting code. But as far as I've heard, it's a dead language. I worked with it for a while and it was great, but I think the community support for it has also faded, so I guess it would be better not to update it in future runs of the comparisons, unless you know a niche sector that is still using it.
Since some languages may be slower but require fewer lines of code for most tasks, I suggest
adding a chart showing the LOC by language for each task.
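A rough way to gather the data for such a chart (a sketch only; the file layout and names here are my assumptions, not the repo's):

```python
def count_loc(path):
    """Count non-blank lines in a source file -- a rough LOC proxy."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())

# e.g. loc = {lang: count_loc(f"{lang}/test_fibonacci{ext}")
#             for lang, ext in [("Julia", ".jl"), ("Python", ".py")]}
```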
Thanks for this project.
It is really interesting to see that Numba gets faster than Julia. However, while reproducing the results, I found that some benchmarks in this repo do not actually support the conclusions.
TL;DR: After fixing the benchmark, the comparison shows that Julia is >10x faster than Numba for the evaluate_functions case.
Consider the case evaluate_functions: the Numba version uses a parallel for loop. This seems to produce faster programs than the other languages, but it is misleading. A program using prange gives different results from one using range, and the latter is the behaviour of the corresponding Julia program.
In [17]: @njit(parallel=True)
    ...: def evaluate_functions(n):
    ...:     """
    ...:     Evaluate the trigonometric functions for n values evenly
    ...:     spaced over the interval [-1500.00, 1500.00]
    ...:     """
    ...:     vector1 = np.linspace(-1500.00, 1500.0, n)
    ...:     iterations = 10000
    ...:     for i in range(iterations):
    ...:         vector2 = np.sin(vector1)
    ...:         vector1 = np.arcsin(vector2)
    ...:         vector2 = np.cos(vector1)
    ...:         vector1 = np.arccos(vector2)
    ...:         vector2 = np.tan(vector1)
    ...:         vector1 = np.arctan(vector2)
    ...:     return vector1
    ...:

In [18]: evaluate_functions(10)
Out[18]:
array([1.46030424, 1.13579218, 0.81128013, 0.48676808, 0.16225603,
       0.16225603, 0.48676808, 0.81128013, 1.13579218, 1.46030424])
In [19]: @njit(parallel=True)
    ...: def evaluate_functions(n):
    ...:     """
    ...:     Evaluate the trigonometric functions for n values evenly
    ...:     spaced over the interval [-1500.00, 1500.00]
    ...:     """
    ...:     vector1 = np.linspace(-1500.00, 1500.0, n)
    ...:     iterations = 10000
    ...:     for i in prange(iterations):
    ...:         vector2 = np.sin(vector1)
    ...:         vector1 = np.arcsin(vector2)
    ...:         vector2 = np.cos(vector1)
    ...:         vector1 = np.arccos(vector2)
    ...:         vector2 = np.tan(vector1)
    ...:         vector1 = np.arctan(vector2)
    ...:     return vector1
    ...:

In [20]: evaluate_functions(10)
Out[20]:
array([-1500.        , -1166.66666667,  -833.33333333,  -500.        ,
        -166.66666667,   166.66666667,   500.        ,   833.33333333,
        1166.66666667,  1500.        ])
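The root cause is that each iteration reads the vectors written by the previous iteration, so the loop over iterations carries a dependency and is not safe to parallelize. A tiny pure-Python illustration of the same pattern (my own toy update rule, no Numba involved; this only illustrates the dependency, it is not Numba's exact semantics):

```python
def sequential(x, iterations):
    # each step depends on the previous step's result
    for _ in range(iterations):
        x = (x * 2) % 97
    return x

def pseudo_parallel(x, iterations):
    # running the iterations "independently" (what parallelizing the
    # iteration loop effectively does) just repeats the first step
    results = [(x * 2) % 97 for _ in range(iterations)]
    return results[-1]

print(sequential(5, 3))       # 40: three chained doublings
print(pseudo_parallel(5, 3))  # 10: every "iteration" starts from x=5
```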
The Julia behaviour:
function evaluatefunctions(N)
    #x = linspace(-1500.0, 1500.0, N)
    x = collect(range(-1500.0, stop=1500.0, length=N))
    M = 10000
    for i in 1:M
        y = sin.(x)
        x = asin.(y)
        y = cos.(x)
        x = acos.(y)
        y = tan.(x)
        x = atan.(y)
    end
    return x
end
julia> evaluatefunctions(N)
10-element Vector{Float64}:
 1.4603042376686257
 1.135792184853451
 0.8112801320381631
 0.48676807922287524
 0.16225602640761583
 0.16225602640761583
 0.48676807922287524
 0.8112801320381631
 1.135792184853451
 1.4603042376686257
Fixing the incorrect use of prange gives a fair result:
In [16]: %timeit evaluate_functions(10)
55.4 ms ± 2.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
julia> N = 10
10
julia> @btime evaluatefunctions(N)
4.224 ms (60001 allocations: 8.24 MiB)
Some benchmarks use quite a bit of printing to stdout during the benchmarks. This is not a good idea, and will in some cases dominate runtime.
In particular, test_count_unique_words.jl prints each line before counting the words. The current implementation of the word count is already very inefficient, but even so, printing each line takes 10-100x as long as the word count itself. File reading will also dominate.
If the purpose of the benchmark is to measure the performance of string manipulation, then reading from file and printing to stdout will make the benchmark meaningless, as it almost only measures the latter two.
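To see how much per-item output distorts a measurement, one can time the same kernel with and without writes. A sketch with made-up data (even writing to an in-memory buffer adds overhead; a real terminal or file is far slower):

```python
import io
import time

words = ("lorem ipsum dolor sit amet " * 1000).split()

def count_unique(ws, out=None):
    seen = set()
    for w in ws:
        if out is not None:
            print(w, file=out)  # per-word output inside the timed region
        seen.add(w)
    return len(seen)

t0 = time.perf_counter()
quiet = count_unique(words)
t1 = time.perf_counter()
noisy = count_unique(words, out=io.StringIO())
t2 = time.perf_counter()
# both runs do identical counting work; any extra time in the second
# run is pure output overhead, not string manipulation
print(quiet, noisy, t1 - t0, t2 - t1)
```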
Rust has a thriving computational community.
When I looked at the benchmarks, I was particularly surprised because I couldn't imagine the recursive fib being that fast compared to C.
The R recursion example is not an apples-to-apples comparison, because it is using two accumulators. In fact the commented out R code at https://github.com/JulesKouatchou/basic_language_comparison/blob/master/R/test_fibonacci.R#L15 is the right one. Or alternatively, all languages should use the one with two accumulators.
For example, here's the Julia version that does what the R version does and runs as fast.
function Fib(N)
    function F(N, a, b)
        N == 0 && return b
        F(N-1, a+b, a)
    end
    F(N, 1, 0)
end
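To see concretely why the two recursions aren't comparable, here is a sketch of both forms in Python (the call counter is my addition): the naive doubly-recursive version makes exponentially many calls, while the two-accumulator version is linear.

```python
calls = {"naive": 0, "acc": 0}

def fib_naive(n):
    # doubly recursive: call count grows exponentially in n
    calls["naive"] += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

def fib_acc(n, a=1, b=0):
    # two-accumulator form, like the (commented-out) R code: linear in n
    calls["acc"] += 1
    return b if n == 0 else fib_acc(n - 1, a + b, a)

assert fib_naive(20) == fib_acc(20) == 6765
print(calls)  # naive: 21891 calls, acc: 21 calls
```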
Much of this was done by @syxpek on the Julia slack. I am just reporting it.
You do:

File f = new File(filename);
ArrayList arr = new ArrayList();
HashMap<String, Integer> listOfWords = new HashMap<String, Integer>();
Scanner in = new Scanner(f);
int i = 0;
while (in.hasNext()) {
    String s = in.next();
    arr.add(s.replaceAll("[^a-zA-Z]", "").toLowerCase());
}
Iterator itr = arr.iterator();
while (itr.hasNext()) {
    i++;
    listOfWords.put((String) itr.next(), i);
}
Set<Object> uniqueValues = new HashSet<Object>(listOfWords.values());
This is inefficient in multiple aspects:
String.replaceAll("[^a-zA-Z]", ...) parses and compiles the regex pattern [^a-zA-Z] over and over again. You should call Pattern.compile("[^a-zA-Z]") once before the loop and then reuse it within the loop. If you lower-case first, the pattern can even be reduced to [^a-z], as you will not find uppercase letters any more. So make it this (only the core part here):
HashSet<String> uniqueValues = new HashSet<>();
Pattern pattern = Pattern.compile("[^a-zA-Z]");
while (in.hasNext()) {
    String s = in.next();
    s = pattern.matcher(s.toLowerCase()).replaceAll("");
    uniqueValues.add(s);
}
This is ~3 times faster (after JVM warm-up) than your implementation, with much less and simpler code.
Note that you already have such an optimized implementation for Python: https://github.com/JulesKouatchou/basic_language_comparison/blob/master/Python/test_count_unique_words.py#L29-L31
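The same precompilation principle can be made explicit in Python too. A small sketch (note that CPython's re module does cache compiled patterns internally, so the win is smaller than in Java, but hoisting the compile keeps the loop body cheap and the intent clear):

```python
import re

# compile once outside the loop; after .lower(), [^a-z] is sufficient
NON_ALPHA = re.compile(r"[^a-z]")

def normalize(words):
    return [NON_ALPHA.sub("", w.lower()) for w in words]

print(normalize(["Hello,", "WORLD!!", "a1b2"]))  # ['hello', 'world', 'ab']
```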
It would be great to use the current stable version of Julia (1.3) or at least the long term stable version (1.0) for the results that are published here:
The OCaml language community recently started working hard on providing infrastructure for scientific and domain-specific computing. The OWL framework is one of the most developed ones for OCaml; see their GitHub repositories. And companies like Imandra help bring more correctness (using the sound OCaml type system) into production-level computations.
I compare here the C-language implementation and Julia.
I noticed that the number of sweeps executed by these two implementations was different.
I tracked it down to the return line in timeStep4_Jacobi:
this C function fails to return the square root of the error accumulator, which the Julia functions do. Hence, when the error is compared with the tolerance, the C-language implementation exits the while loop much sooner. The number of sweeps is consequently not equal between the implementations, which invalidates the comparison.
I think it would be best for the benchmark to fix the number of sweeps and not test the error in the loop condition.
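A fixed-sweep harness could look like this rough NumPy sketch (the grid size, boundary setup, stencil choice and function name here are my own, not the repo's):

```python
import numpy as np

def jacobi_fixed_sweeps(u, sweeps):
    """Run exactly `sweeps` 4-point Jacobi updates; return grid and last residual."""
    err = 0.0
    for _ in range(sweeps):
        new = u.copy()
        new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                  u[1:-1, :-2] + u[1:-1, 2:])
        err = float(np.sqrt(np.sum((new - u) ** 2)))
        u = new
    return u, err

u0 = np.zeros((16, 16))
u0[0, :] = 1.0  # fixed hot boundary on one edge
u1, residual = jacobi_fixed_sweeps(u0, 100)
# every implementation now performs exactly 100 sweeps, so runtimes are
# directly comparable regardless of how each language computes the error
```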
Please improve two Java code lines to avoid performance penalties:

result.append(times + "" + repeat); // YOUR code creates (finally millions of) intermediate StringBuilders
result.append(times).append(repeat); // Better: THIS code avoids them - you already have a StringBuilder

StringBuilder result = new StringBuilder(); // YOUR code creates a Builder with a small (default-sized) buffer which must be grown over and over while appending characters
StringBuilder result = new StringBuilder(number.length() * 2); // Better: THIS code directly creates a buffer of the expected size
Please note that you already implemented the C version in this optimized way: https://github.com/JulesKouatchou/basic_language_comparison/blob/master/C/test_look_and_say.c#L18
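For comparison, the equivalent "build once, avoid repeated reallocation" idiom in Python is to collect pieces in a list and join once at the end. A sketch of one look-and-say step (my own implementation, not the repo's):

```python
def look_and_say(seq):
    """One look-and-say step, building the result in a single pass."""
    pieces = []  # collect parts and join once: avoids quadratic
    i = 0        # string concatenation, like presizing a StringBuilder
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1
        pieces.append(str(j - i))  # run length
        pieces.append(seq[i])      # run character
        i = j
    return "".join(pieces)

print(look_and_say("1"))     # '11'
print(look_and_say("1211"))  # '111221'
```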
The results obtained with the fast-math option are not reliable, at least in general.
Hence I believe this option should be disabled for the purpose of this benchmark.
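A two-line illustration of why the reassociation that fast-math licenses is unsafe: floating-point addition is not associative, so reordering a sum can change the result.

```python
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0 -- b + c rounds back to -1e16, so the 1.0 is lost
```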
This code is row major:
function optimized_time_step(u::Matrix{T}) where {T}
    n, m = size(u)
    error = T(0.0) # not an optimization, but makes it work for all element types
    @inbounds for j = 2:m-1
        for i = 2:n-1 # @simd ivdep, but why is it writing into u[j,i]? That's not the time step.
            temp = u[j, i]
            u[j, i] = (T(4) * (u[j-1, i] + u[j+1, i] +
                               u[j, i-1] + u[j, i+1]) +
                       u[j-1, i-1] + u[j+1, i+1] +
                       u[j+1, i-1] + u[j-1, i+1]) / T(20)
            difference = u[j, i] - temp
            error += difference * difference
        end
    end
    return sqrt(error)
end
is probably better code, though this is a case where most people would do:
using LoopVectorization
using LoopVectorization
function optimized_time_step(u::Matrix{T}) where {T}
    n, m = size(u)
    error = T(0.0) # not an optimization, but makes it work for all element types
    @turbo for j = 2:m-1 # @tturbo for threading
        for i = 2:n-1
            temp = u[j, i]
            u[j, i] = (T(4) * (u[j-1, i] + u[j+1, i] +
                               u[j, i-1] + u[j, i+1]) +
                       u[j-1, i-1] + u[j+1, i+1] +
                       u[j+1, i-1] + u[j-1, i+1]) / T(20)
            difference = u[j, i] - temp
            error += difference * difference
        end
    end
    return sqrt(error)
end
The copyright statement doesn't specify the licensing.
Of course, depending on what you want to show, such a "warm-up" is or is not appropriate.
Still, I propose putting a loop for (int i = 0; i < 100; i++) doTheTest();
around the core test methods.
This gives the Java VM time to identify the method doTheTest()
as a hotspot and then JIT-compile it for the next invocation.
In the end you have 100 measurement results. I propose publishing the first one as e.g. "Java cold" and the average of the last 10 as "Java warm / JIT compiled".
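The cold/warm reporting scheme could be sketched like this (a Python stand-in for the harness; the kernel and names are mine, and CPython of course has no JIT, so this only shows the bookkeeping):

```python
import time

def do_the_test():
    # stand-in for the benchmark kernel being measured
    return sum(i * i for i in range(50_000))

samples = []
for _ in range(100):
    t0 = time.perf_counter()
    do_the_test()
    samples.append(time.perf_counter() - t0)

cold = samples[0]               # first run: the "Java cold" analogue
warm = sum(samples[-10:]) / 10  # mean of last 10: "Java warm / JIT compiled"
print(f"cold={cold:.6f}s warm={warm:.6f}s")
```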