-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Solve performance issues #592
Comments
Since I see a mention of memcmp_avx2 would it help if we would run afserver on a CPU with avx512 support or does this make no difference? |
Hello! |
Hey Timur, |
hostmask comes first |
Would adding pools instead of hostmasks to a job increase its performance? For example, Berlin: 100, rest: 0? Currently, we have a pool per site, but we don't assign them to jobs. Since our hostnames vary per site, we ensure that a job runs on the intended site by only setting a hostmask. I mean, if we need to check a Job against ~10 pools, vs a jobs hostsmask against ~1000 nodes it should increase our performance, I would expect. Thanks, |
Hey Timur, we have also noticed that the cpu load of afserver goes up just by having lots of "DON" jobs. we usually have 1000 RDY jobs and sometime between 4000-10000 DON jobs and they seem to have a huge impact on the server performance. shouldn't those jobs be skipped really fast? is there anything that the server needs to check on them? cheers |
Hello!
|
Hi Timur, since we are really struggling with many DON jobs right now (that we really need to keep) I wanted to ask if your proposal could be implemented in the near future? maybe the next 1-2 months? We already run automatic job cleanup-scripts and we cannot delete any more jobs because artists still need them. this would certainly help reduce the pressure on our farm enormously. |
i have created a follow up ticket regarding multi-threading improvement over here #594 |
Our afserver currently uses the std:regex library to match the hostmasks, but can also use the posix regex libary. The former is "new" for us under almalinux 9, the latter was previously used by us on centos 6 (and maybe partly centos 7). I now had a look at the regex library speeds of std::regex and posix regex. It turns out, what the rest of the internet also says, that std::regex is much slower. factor 2-3. i then built a comparison with my incredible c++ knowledge (and maybe a bit of ChatGPT). Result: with our hostmasks the difference is clear, posix is 3x faster.
Posix regex is built into af, but you can't control which one it uses. it decides itself at compile time based on the compiler version. So I would have to add the option to change via command line. feature-wise the engines are the same - at least for our usecases. we're even more likely to go back to our "old setup", because the fact that std::regex was used instead of posix is "rather coincidental" due to the no-so-transparent selection process with the upgrade to alma. @timurhai do you see any issue arising from changing the posix lib even when under std-c++11 compile flag? |
Yes, we can add some |
We can also implement some ps |
Hello! Take a look: |
Yes! It can help, but like I wrote there it will also limit a few users, so we need to talk about it internally. As you can see above in the perf screenshots most time is spent in "hasTickets". Maybe you have an idea how to speed that up additionally? I think the main issue comes from having to walk all the way for each task to each render to each pool the render is in. many lookups in the ticket maps, which take long. |
Interesting idea, but regex matching is not our problem at RISE. in our case the hasTickets function is still the bottleneck. maybe we should rather look into this issue > #596 |
Hello we are running into performance issues where the solve thread on the server is running at 100% and at some point the server is crashing.
I attached
perf
to the server and found that it is using a lot of its time in "hasTicket", "hasPoolTicket" and "hasHostTickets" calls, as well as a bit less time in regex matching. In the next images you can see the cumulated times used in functions. First read from bottom up, which is the start of the afserver. We are also only looking at the solve thread. The dark blue underlined bars are the time spend in regexes, the light blue bars are the time spend in hasXXXTickets. Especially the ::find(string) calls of the underlying ticket map seem to hit hard.The next images are inverted, so the long calls are at the bottom. First image hightligts the "hasXXXTicket", the second image hightlights the hostmask regexes.
The broader a bar is at the bottom the longer the function took cumulatively. You can see map::find() takes quite some time cumulatively
How can we speed up the hasTicket queries?
Can we do something about how afserver is compiled?
Can we switch out the implementation to something faster (I think unordered_map is faster in access).
Do you have any other ideas?
Parallelize some things?
Regards
Sebastian
The text was updated successfully, but these errors were encountered: