-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Slow string performance #505
Comments
I have prepared a stripped down version of my program that reproduces this problem. |
Thanks! Could you plase clarify how long it takes to run the Lua and Pallene versions of this benchmark? I'm confused if this is the one where Pallene took 0.3 seconds or 1.2 seconds or if it is yet another number. Another thing that I wonder is if there is a way to build a smaller example. Does this behavior only appear in large Pallene files? |
No, the timings are different for these files. This is actually a stripped down version of my program. I originally tried to do a much simpler program from scratch, but it did not reproduce the behavior. So I had to go back to my original program and start ripping things out. This all took me a lot of time to do, but I feel the program submitted here is a reasonable compromise. I don't think it as bad as it seems. I believe the performance bottleneck is contained to the function So the timings for these attached files on my computer are as follows: Pallene: (made sure .lua was deleted)
Lua: (made sure .so file was deleted)
P.S. My test program here adds some extra loops to pad out the run time because my real program had a lot more data. |
Discussed in #504
Originally posted by ewmailing November 11, 2021
Hi, I'm not doing any serious benchmarking yet, but I have been spot checking here and there to see what Pallene's performance is like.
So for some context, I've been writing a little stock backtester. The basic idea is you load a bunch of historical stock data and you iterate through it. For each tick, you run some custom rules depending on what ideas you want to test. Usually you will be computing various technical indicators here (which usually involves running through arrays and doing some math operations), and making decisions (branches) to buy and sell.
I speculated that Pallene might do well in this type of use case, particularly the iterating through arrays and doing math. I was hoping that native execution, plus lower overhead (to help things like cache locality and prefetching) might yield good results in Pallene.
As a quick and dirty benchmark reference, I used the Python based Backtesting.py library and used a stripped down version of their hello world tutorial which makes a simple buy/sell decision based on Simple Moving Average (SMA) cross-overs. I then implemented from scratch my Pallene approximate equivalent.
I haven't done much Python development, so I immediately got first hand experience of how slow Python is. Wow. And in this hello world, the SMA arrays are pre-computed at init, so there is no serious math crunching done in the main loop. So this this isn't just an issue with Python big nums being slow, but many other things in Python being slow.
Anyway, I digress. To make measuring a little clearer, I put a for-loop of 1000 around the main loops of the programs to extend out the run times. This would help mitigate the effects of the slow loading times of Python.
So Python took 58 seconds.
Pallene took .3 seconds
Lua (using Pallene's --emit-lua) took .9 seconds.
So Pallene was about 3 times faster than pure Lua. And both were a magnitude faster than Python.
I then wrote a second math heavy experiment that was centered around options trading, which has a lot more number crunching (and uses heavy instructions like math.ln). Unfortunately, Backtesting.py doesn't have any options support, so I could not compare to Python. But Pallene was about 4x faster than Lua in this case.
So after that, I needed to implement some graphs for the results for my backtester so I could visualize the results. I am doing a fast and dirty thing, so I am generating HTML/JS graphs that can be opened in a web browser. So basically my code needs to take all my all my data in a program, and write out a web page with a JavaScript program that generates the desired graph. So there is a lot of string stuff here.
Since there are few string built-ins right now in Pallene, my code is pretty dumb and involves hard coded strings I wrote into my code concatenated with on-the-fly strings, using the .. operator.
What has surprised me about this performance is Pallene seems to be slower in this case than Lua (via --emit-lua). I did compile with -O2 and -O3 to make sure it wasn't just a compiler optimization problem, but Pallene was still slower. And I didn't investigate further, so I don't know if the bottleneck is the actual string code, or the Pallene io.write function.
In Pallene, it took 1.12 seconds.
In Lua, it took 0.27 seconds.
(I did not loop these 1000 times and this measurement was just the graph generation function, and not the entire program, whereas earlier, I was timing the whole program from launch to exit.)
Also I think Pallene's start up time may be a little slower, although I didn't do a clean measurement of start up times. But that start up time is negligible compared the string handling difference. I'm guessing this means loading the .so takes longer than loading/interpreting the emitted Lua script.
Anyway, I was wondering if this performance was known about, and I thought I would mention it if you didn't.
I'm thinking of re-writing/moving the graph generation code to Lua anyway because I think I want to use string.gsub to make it easier for me to insert code into my JavaScript template.
The text was updated successfully, but these errors were encountered: