The fastest languages are cheating #119
Comments
@983 amazing investigative work on this. It's interesting to see how the compiler is optimizing the recursive calls. I am open to suggestions on how to improve the fairness of the benchmark.
I can see the following options:
The last two options do not completely fix the problem, because gcc will still inline the last recursive call; getting rid of that requires specifying an additional flag. The Nim, Fortran, and Cython benchmarks can be fixed the same way:
In case this behavior changes in the future, here is the process I used to determine which flags were responsible:
Nevertheless, I think a compiler that is very good at inlining should still be honored in some way. Maybe the running time with all optimizations enabled could be included in another column or table?
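For illustration, one way to keep gcc from inlining the recursive call (independent of whichever project-wide flags were actually chosen, which are not quoted above) is to mark the function itself as non-inlinable. This is a generic C++ sketch, not code from the repository:

```cpp
#include <cstdint>
#include <cstdio>

// __attribute__((noinline)) asks GCC/Clang not to inline fib into any caller,
// including fib itself, so every recursive step stays a real function call.
__attribute__((noinline))
uint64_t fib(uint64_t n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

int main() {
    std::printf("%llu\n", static_cast<unsigned long long>(fib(40)));
    return 0;
}
```

Passing -fno-inline to gcc has a similar effect for the whole translation unit.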
The average time it takes 10 developers to find the compiler flags needed to go from "regularly compiled code" to "optimized code" could be included in the table. This would make the results for the optimized Nim code shine.
I think the best solution is to implement #85 and change the examples to take a parameter. This should prevent some of the optimizations made here, but still allow languages with better inlining to shine. For now, I have added the flag
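A minimal sketch of the take-a-parameter idea (assuming the value is read from the command line; #85 may specify it differently). Because `n` is no longer a compile-time constant, the compiler cannot precompute the result or specialize the top of the recursion for a known value:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>

uint64_t fib(uint64_t n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

int main(int argc, char **argv) {
    // n is only known at run time, so constant folding of fib(n) is impossible.
    uint64_t n = argc > 1 ? std::strtoull(argv[1], nullptr, 10) : 30;
    std::printf("fib(%llu) = %llu\n",
                static_cast<unsigned long long>(n),
                static_cast<unsigned long long>(fib(n)));
    return 0;
}
```

As pointed out further down in the issue, this only removes constant propagation of `n`; the body of `fib` can still be partially inlined into itself.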
In a way I both agree and disagree with the comment: compiler optimization should not be restricted; if a language allows it, so much the better. However, I think the optimized folder contains code that is "human" optimized (see the memoized Lisp function in the issue I opened), and that may be cheating. Maybe the Fibonacci function is not the best thing to benchmark, for that specific reason?
I have moved back to allowing compiler optimizations. However, I will only allow compiler flags that are deemed safe and of "release" quality, i.e. -O3.
Naive recursive Fibonacci, O(Fib(N)), is definitely not representative of most code, and the existence of a simple O(N) iterative algorithm makes the naive version even more artificial. If you choose naive recursive Fibonacci as your test function in the first place, surely the point is to test how well compilers convert inefficient double recursion into something like a loop containing a single recursion, or even something better. And function-call overhead, since a single addition is a tiny amount of work for a single function call.

Doubly-recursive functions for traversing binary trees are common in real programs, but they differ in not having redundant common subexpressions across the two calls. So that recursion pattern is one that's important for compilers to know how to optimize at least somewhat.

If you want to know how a language performs in general across a variety of computational problems, naive Fibonacci isn't a good choice, but it is an interesting and well-known problem to benchmark. (https://stackoverflow.com/questions/76611371/execution-time-of-recursive-fibonacci-function-is-slower-in-c-than-equivalent-ni is an example of someone being puzzled by Nim being faster than C on this benchmark.) Perhaps some text somewhere pointing out that apparently trivial differences can lead to major compiler optimizations could avoid confusing people?
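To make the complexity gap concrete, here are the two algorithms side by side (a generic sketch, not code from this repository): the naive version performs O(Fib(N)) calls, while the iterative one performs O(N) additions.

```cpp
#include <cstdint>

// O(Fib(N)) calls: each call spawns two more until n < 2,
// recomputing the same subproblems over and over.
uint64_t fib_naive(uint64_t n) {
    return n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2);
}

// O(N) additions: carry the last two values forward in a loop.
uint64_t fib_iterative(uint64_t n) {
    uint64_t a = 0, b = 1;
    for (uint64_t i = 0; i < n; ++i) {
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```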
When compiling Nim to C++, the simplified result is just an ordinary doubly-recursive function.
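A hedged sketch of what that generated code amounts to (identifier names are invented here; the actual Nim-emitted C++ is more verbose):

```cpp
#include <cstdint>

// Roughly what the Nim compiler's C++ output boils down to after
// simplification: a plain doubly-recursive Fibonacci, no tricks in the source.
uint64_t fib(uint64_t n) {
    if (n < 2) {
        return n;
    }
    return fib(n - 1) + fib(n - 2);
}
```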
Nothing wrong here yet. But when that code is compiled with `g++`, the disassembled program looks quite different.
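As a hand-written illustration of the effect (not the actual g++ output), inlining a few recursion levels and folding the duplicated subcalls turns the top of the call tree into plain arithmetic on a handful of deeper calls, for example via the identity fib(n) = 3*fib(n-3) + 2*fib(n-4):

```cpp
#include <cstdint>

// The out-of-line recursive function that still exists in the binary.
uint64_t fib(uint64_t n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// Illustration only, assuming n >= 4:
//   fib(n) = fib(n-1) + fib(n-2)
//          = 2*fib(n-2) + fib(n-3)
//          = 3*fib(n-3) + 2*fib(n-4)
// g++ unrolls further than this, which is how it ends up calling fib only
// for a band of smaller arguments and summing the partial results.
uint64_t fib_top_unrolled(uint64_t n) {
    return 3 * fib(n - 3) + 2 * fib(n - 4);
}
```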
Apparently, `fib` is only called for the parameters 34, 35, 36, 37, 38, 39, 40, and then the results are added up. This is not a fair comparison, because Nim is not doing the same computations as the other languages. I believe that Nim should be moved to the Optimized section. Otherwise, one would have to argue how much unrolling is okay and how much is too much, and there is no good answer for that.

Likewise, C and C++ partially unroll the `fib` function calls, although not as far. I have not checked the Fortran and Cython binaries, but they are probably doing the same thing.
It was suggested in another issue that the parameter `n` should be configurable to prevent optimization. However, that is insufficient, since it still allows the compiler to unroll the `fib` function for small values of `n`. The only solution seems to be to inspect the generated assembly code. Here is a fair assembly implementation to compare against; if a program is faster, it is likely cheating in some way or another.