basic_language_comparison's People

Contributors: brenhinkeller, juleskouatchou


basic_language_comparison's Issues

Jacobi iteration benchmark implementation not correct

I compared the C-language implementation with the Julia one.

I noticed that the number of sweeps executed by the two implementations differed. I tracked it down to the return line in timeStep4_Jacobi: the C function fails to return the square root of the error accumulator, which the Julia functions do. Hence, when the error is compared with the tolerance, the C-language implementation exits the while loop much sooner. The sweep counts are consequently unequal between the implementations, which invalidates the comparison.

I think it would be best for the benchmark to fix the number of sweeps rather than test the error in the loop condition.
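The effect is easy to see numerically. Below is a minimal sketch with made-up numbers (not taken from the benchmark): for tolerances below 1, the squared error falls under the tolerance long before the root-error does.

```python
tol = 1e-6
err_sq = 1e-7  # accumulated sum of squared differences after some sweep

print(err_sq < tol)         # True  -> the C loop would exit here
print(err_sq ** 0.5 < tol)  # False -> the Julia loop keeps sweeping
```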

Julia is kind of dead? Why still include it?

Julia is good in that its syntax is nearly identical to MATLAB's, which makes it great for converting code. But as far as I've heard, it's a dead language. I worked with it for a while and it was great, but I think the community support for it has also faded, so it might be better not to update it in future runs of the comparison, unless you know of a niche sector that is still using it.

avoid unrelated comparisons

It is really interesting to see that Numba comes out faster than Julia. However, while reproducing the results, I found that some benchmarks in this repo do not actually support the conclusions drawn from them.

TL;DR: after fixing the benchmark, Julia is >10x faster than Numba on the evaluate_functions case.

Consider the case evaluate_functions: the Numba version uses a parallel for loop. This seems to produce faster programs than the other languages, but it is quite misleading.

A program using prange gives different results from one using range, and the latter matches the behaviour of the corresponding Julia program.

In [17]: @njit(parallel=True)
    ...: def evaluate_functions(n):
    ...:     """
    ...:         Evaluate the trigononmetric functions for n values evenly
    ...:         spaced over the interval [-1500.00, 1500.00]
    ...:     """
    ...:     vector1 = np.linspace(-1500.00, 1500.0, n)
    ...:     iterations = 10000
    ...:     for i in range(iterations):
    ...:         vector2 = np.sin(vector1)
    ...:         vector1 = np.arcsin(vector2)
    ...:         vector2 = np.cos(vector1)
    ...:         vector1 = np.arccos(vector2)
    ...:         vector2 = np.tan(vector1)
    ...:         vector1 = np.arctan(vector2)
    ...:     return vector1
    ...:

In [18]: evaluate_functions(10)
Out[18]:
array([1.46030424, 1.13579218, 0.81128013, 0.48676808, 0.16225603,
       0.16225603, 0.48676808, 0.81128013, 1.13579218, 1.46030424])

In [19]: @njit(parallel=True)
    ...: def evaluate_functions(n):
    ...:     """
    ...:         Evaluate the trigononmetric functions for n values evenly
    ...:         spaced over the interval [-1500.00, 1500.00]
    ...:     """
    ...:     vector1 = np.linspace(-1500.00, 1500.0, n)
    ...:     iterations = 10000
    ...:     for i in prange(iterations):
    ...:         vector2 = np.sin(vector1)
    ...:         vector1 = np.arcsin(vector2)
    ...:         vector2 = np.cos(vector1)
    ...:         vector1 = np.arccos(vector2)
    ...:         vector2 = np.tan(vector1)
    ...:         vector1 = np.arctan(vector2)
    ...:     return vector1
    ...:

In [20]: evaluate_functions(10)
Out[20]:
array([-1500.        , -1166.66666667,  -833.33333333,  -500.        ,
        -166.66666667,   166.66666667,   500.        ,   833.33333333,
        1166.66666667,  1500.        ])

The Julia behaviour:

function evaluatefunctions(N)
    #x = linspace(-1500.0, 1500.0, N)
    x = collect(range(-1500.0, stop=1500.0, length=N))
    M = 10000
    for i in 1:M
        y = sin.(x)
        x = asin.(y)
        y = cos.(x)
        x = acos.(y)
        y = tan.(x)
        x = atan.(y)
    end
    return x
end

julia> evaluatefunctions(N)
10-element Vector{Float64}:
 1.4603042376686257
 1.135792184853451
 0.8112801320381631
 0.48676807922287524
 0.16225602640761583
 0.16225602640761583
 0.48676807922287524
 0.8112801320381631
 1.135792184853451
 1.4603042376686257

Fixing the incorrect use of prange gives a fair result:

In [16]: %timeit evaluate_functions(10)
55.4 ms ± 2.95 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
julia> N = 10
10

julia> @btime evaluatefunctions(N)
  4.224 ms (60001 allocations: 8.24 MiB)
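The root cause can be seen without Numba at all: the outer loop is serially dependent, since each iteration reads the vector produced by the previous one, so its iterations cannot legally be distributed across threads. A minimal pure-Python sketch of the same dependency pattern (with sqrt standing in for the trig round-trips):

```python
# Sequential loop: iteration i consumes the result of iteration i-1.
x = 2.0
for i in range(3):
    x = x ** 0.5          # x -> sqrt(x) each sweep

# After 3 dependent sweeps: 2 ** (1/8).
print(x)

# If the 3 iterations instead ran independently from the ORIGINAL x
# (which is what distributing them across threads amounts to), each
# would compute plain sqrt(2) and the chaining would be lost.
independent = [2.0 ** 0.5 for i in range(3)]
print(x == independent[-1])  # False
```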

Java-Implementation for `testCountUniqueWords` is highly inefficient

You do

File f = new File(filename);
ArrayList arr = new ArrayList();
HashMap<String, Integer> listOfWords = new HashMap<String, Integer>(); 
Scanner in = new Scanner(f);
int i = 0;
while(in.hasNext()) {
  String s = in.next();
  arr.add(s.replaceAll("[^a-zA-Z]", "").toLowerCase());
}
Iterator itr=arr.iterator();
while(itr.hasNext()) {
  i++;
  listOfWords.put((String) itr.next(), i);
}
Set<Object> uniqueValues = new HashSet<Object>(listOfWords.values()); 

This is inefficient in multiple aspects:

  1. The String.replaceAll("[^a-zA-Z]", ...) parses and compiles the regex pattern [^a-zA-Z] over and over again. You should call Pattern.compile("[^a-zA-Z]") once before the loop and then reuse the compiled pattern within it.
  2. You first apply the regex (to remove non-alphabetic characters) and then apply toLowerCase. It is better to lowercase first and then use a simpler regex, namely [^a-z], since no uppercase letters remain.
  3. You first collect all words in an ArrayList, then loop over that ArrayList to fill a HashMap (with the unique words as keys), and finally create a third collection (the HashSet). None of this is necessary: you can fill a single HashSet directly in the first loop.

So do this instead (only the core part shown):

HashSet<String> uniqueValues = new HashSet<>();
Pattern pattern = Pattern.compile("[^a-z]"); // lowercase is applied first, so the simpler pattern suffices
while (in.hasNext()) {
  String s = in.next();
  s = pattern.matcher(s.toLowerCase()).replaceAll("");
  uniqueValues.add(s);
}

This is ~3 times faster (after JVM warm-up) than your implementation, with far less and simpler code.

Note that you already have such an optimal implementation for Python: https://github.com/JulesKouatchou/basic_language_comparison/blob/master/Python/test_count_unique_words.py#L29-L31
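For readers following along in Python, the same one-pass shape looks like this (a sketch with a made-up word list, not the repo's test data):

```python
import re

# One pass, one collection: lowercase first, strip non-letters with a
# single precompiled pattern, and fill a set directly.
pattern = re.compile(r"[^a-z]")
words = "The quick brown Fox jumps over the lazy dog The".split()
unique = {pattern.sub("", w.lower()) for w in words}
print(len(unique))  # 8 distinct words ("the" appears three times)
```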

Fast-math option shouldn't be used

The results obtained with the fast-math option are not reliable, at least in general.
Hence I believe this option should be disabled for the purpose of this benchmark.
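One concrete reason: fast-math licenses the compiler to reassociate floating-point operations, and floating-point addition is not associative, so the same expression can yield different results depending on how it is optimized. A plain Python illustration of the order-dependence (no fast-math involved, just IEEE doubles):

```python
# (a + b) + c and a + (b + c) differ in IEEE double precision:
# 1.0 is below the rounding granularity at 1e16, so it vanishes
# when added to -1e16 via the intermediate b + c.
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0
```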

Avoid intermediate printing in benchmarks

Some benchmarks use quite a bit of printing to stdout during the benchmarks. This is not a good idea, and will in some cases dominate runtime.

In particular, test_count_unique_words.jl prints each line before counting the words. The current word-count implementation is itself very inefficient, but even so, printing each line takes 10-100x as long as the word count itself. File reading will also dominate.

If the purpose of the benchmark is to measure the performance of string manipulation, then reading from file and printing to stdout will make the benchmark meaningless, as it almost only measures the latter two.
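A common fix is to keep all I/O outside the timed region: read the whole input before starting the clock, and print only after it stops. A sketch in Python, with an in-memory stand-in for the input file:

```python
import time

# Stand-in for the input file, read fully BEFORE timing starts.
text = "the quick brown fox\n" * 1000

t0 = time.perf_counter()
n_words = sum(len(line.split()) for line in text.splitlines())  # the work under test
elapsed = time.perf_counter() - t0

# Report AFTER the clock stops, so printing cannot pollute the measurement.
print(n_words, f"{elapsed:.6f}s")
```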

Julia is column major

This code traverses the array in row-major order, but Julia stores arrays in column-major order:

https://github.com/JulesKouatchou/basic_language_comparison/blob/master/Julia/test_laplace_jacobi_4.jl#L40-L55

function optimized_time_step(u::Matrix{T})  where {T}
    n, m = size(u)
    error = T(0.0) # not an optimization, but makes it work for all element types
    @inbounds for j = 2:m-1
        for i = 2:n-1 # @simd ivdep, but why is it writing into u[j,i]? That's not the time step.
            temp = u[j, i];
            u[j, i] = ( T(4) *(u[j-1, i] + u[j+1, i] +
                        u[j, i-1] + u[j, i+1]) +
                        u[j-1, i-1] + u[j+1, i+1] +
                        u[j+1, i-1] + u[j-1, i+1]) / T(20)
            difference = u[j, i] - temp
            error += difference*difference;
        end
    end
    return sqrt(error)
end

is probably better code, though this is a case where most people would simply use:

using LoopVectorization
function optimized_time_step(u::Matrix{T})  where {T}
    n, m = size(u)
    error = T(0.0) # not an optimization, but makes it work for all element types
    @turbo for j = 2:m-1 # @tturbo for threading
        for i = 2:n-1
            temp = u[j, i];
            u[j, i] = ( T(4) *(u[j-1, i] + u[j+1, i] +
                        u[j, i-1] + u[j, i+1]) +
                        u[j-1, i-1] + u[j+1, i+1] +
                        u[j+1, i-1] + u[j-1, i+1]) / T(20)
            difference = u[j, i] - temp
            error += difference*difference;
        end
    end
    return sqrt(error)
end
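To make the layout point concrete outside Julia, here is a pure-Python sketch of column-major storage (Julia's convention), where element (i, j) of an n×m matrix lives at flat index i + j*n. The inner loop should therefore run over i, so that consecutive iterations touch consecutive memory:

```python
# An n x m matrix stored column-major in a flat list, as Julia stores it.
n, m = 3, 4
flat = list(range(n * m))  # flat[0..2] is column 0, flat[3..5] is column 1, ...

def at(i, j):
    """0-based element (i, j) under column-major layout."""
    return flat[i + j * n]

column0 = [at(i, 0) for i in range(n)]
row0 = [at(0, j) for j in range(m)]
print(column0)  # [0, 1, 2]    -- a column is contiguous in memory
print(row0)     # [0, 3, 6, 9] -- a row is strided by n
```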

Java-Implementation for `testLookAndSay` is inefficient

Please improve two Java code lines to avoid performance penalties:

result.append(times + "" + repeat);   // YOUR code: creates (eventually millions of) intermediate StringBuilders
result.append(times).append(repeat);  // BETTER: avoids them -- you already have a StringBuilder

StringBuilder result = new StringBuilder();                     // YOUR code: creates a builder with a small default-sized buffer that must be grown over and over while appending
StringBuilder result = new StringBuilder(number.length() * 2);  // BETTER: directly allocates a buffer of the expected size

Please note that you also implemented in this optimized way for the C-version: https://github.com/JulesKouatchou/basic_language_comparison/blob/master/C/test_look_and_say.c#L18
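The same advice translates to other languages: grow one buffer, sized sensibly, instead of creating throwaway intermediates. A Python sketch of a look-and-say step built this way, appending pieces to a list and joining once at the end (the rough analogue of reusing one adequately sized StringBuilder):

```python
def look_and_say(number: str) -> str:
    parts = []          # one growing buffer; joined once at the end
    i = 0
    while i < len(number):
        j = i
        while j < len(number) and number[j] == number[i]:
            j += 1      # scan the run of equal digits
        parts.append(str(j - i))  # run length
        parts.append(number[i])   # the repeated digit
        i = j
    return "".join(parts)

print(look_and_say("1211"))  # "111221"
```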

Should respect a "warm-up" for Java-VM to let JIT-Compiler do its work

Of course, depending on what you want to show, such a "warm-up" may or may not be appropriate.

Still, I propose wrapping the core test methods in a loop: for (int i = 0; i < 100; i++) doTheTest();
This gives the JVM time to identify doTheTest() as a hotspot and to JIT-compile it for subsequent invocations.

You then have 100 measurement results. I propose publishing the first one as e.g. "Java cold" and the average of the last 10 as "Java warm / JIT compiled".
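The proposed reporting scheme, sketched in Python (the Java harness would do the same around doTheTest(); the workload here is a made-up stand-in):

```python
import time

def do_the_test():
    return sum(i * i for i in range(10_000))  # stand-in workload

samples = []
for _ in range(100):                      # 100 timed invocations
    t0 = time.perf_counter()
    do_the_test()
    samples.append(time.perf_counter() - t0)

cold = samples[0]                         # "cold": first, unoptimized run
warm = sum(samples[-10:]) / 10            # "warm": mean of the last 10 runs
print(f"cold: {cold:.6f}s  warm: {warm:.6f}s")
```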

Add OCaml language for comparison

The OCaml community has recently been working hard on providing infrastructure for scientific and domain-specific computing. The OWL framework is one of the most developed for OCaml; see their GitHub repositories. Companies like Imandra are also helping to bring more correctness (using OCaml's sound type system) into production-level computations.

cc @mseri @ryanrhymes

Add chart showing lines of code by language

Since some languages may be slower but require fewer lines of code for most tasks, I suggest adding a chart showing the LOC by language for each task.

Thanks for this project.
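For what it's worth, the raw numbers for such a chart are cheap to gather; a crude sketch below (a real tool like cloc would handle comments and languages properly):

```python
# Count non-blank lines -- a rough LOC metric for the chart.
def count_loc(source: str) -> int:
    return sum(1 for line in source.splitlines() if line.strip())

sample = "x = 1\n\n# a comment\ny = 2\n"
print(count_loc(sample))  # 3 (blank line excluded; comment still counted)
```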

R recursion implementation is not an apples-to-apples comparison

When I looked at the benchmarks, I was particularly surprised because I couldn't imagine the recursive fib being that fast compared to C.

The R recursion example is not an apples-to-apples comparison because it uses two accumulators, making it linear rather than exponential in N. In fact, the commented-out R code at https://github.com/JulesKouatchou/basic_language_comparison/blob/master/R/test_fibonacci.R#L15 is the right one. Alternatively, all languages should use the two-accumulator version.

For example, here's the Julia version that does what the R version does and runs as fast.

function Fib(N)
    function F(N, a, b)
        N == 0 && return b
        F(N-1, a+b, a)
    end
    F(N, 1, 0)
end

Much of this was done by @syxpek on the Julia slack. I am just reporting it.
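For comparison, here are both variants side by side in Python: fib_naive is the exponential-time doubly recursive definition, while fib_acc is the linear-time two-accumulator version that the R code (and the Julia snippet above) uses.

```python
def fib_naive(n):
    # Doubly recursive: an exponential number of calls.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

def fib_acc(n, a=1, b=0):
    # Two accumulators: O(n) calls.
    return b if n == 0 else fib_acc(n - 1, a + b, a)

print(fib_naive(20), fib_acc(20))  # 6765 6765 -- same value, very different work
```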
