-
400s (6-7 minutes) for a sub-orchestration seems really excessive. The only reason for performance this bad is that something is crashing. I would definitely look into the memory usage of your app. I also recommend going through our troubleshooting guide, if you haven't already. Note the part about using Azure Function app diagnostics to help diagnose common problems. What size payloads are you passing between activities and sub-orchestrations? Sending large payloads can have a dramatic impact on performance. Lastly, you should not be running any production apps on Functions v3, since that version is out of support. Please upgrade your runtime to Functions v4 ASAP.
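As a sketch of the large-payload mitigation mentioned above: persist big results to blob storage and pass only the blob name between activities and orchestrations, so the Durable runtime only checkpoints a small string. The names (`BuildDataActivity`, `BuildLargeResult`, `LargeResult`, the `payloads` container) are illustrative assumptions, not from this thread:

```csharp
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class BuildDataActivityFunction
{
    [FunctionName("BuildDataActivity")]
    public static async Task<string> BuildData([ActivityTrigger] string input)
    {
        // Hypothetical heavy work; the large object graph never leaves this activity.
        LargeResult result = BuildLargeResult(input);

        var container = new BlobContainerClient(
            Environment.GetEnvironmentVariable("AzureWebJobsStorage"), "payloads");
        string blobName = $"{Guid.NewGuid():N}.json";
        await container.UploadBlobAsync(blobName, BinaryData.FromObjectAsJson(result));

        // The orchestrator checkpoints this small string instead of megabytes of JSON.
        return blobName;
    }
}
```

Downstream activities would take the blob name as input and download the payload themselves, keeping every message that flows through the Durable task hub small.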
-
Hello @cgillum, is there anything else we should look into?
-
Is there any update on this thread? I'm interested since I'm facing similarly huge CPU usage.
-
My understanding is that hitting the CPU hard via CPU-intensive loops and such causes some sort of storage resource to crash and retry after a delay. #1687 In my durable function, I read a ton of data via API, and this works great. Then when I hit a loop to build 10k large objects in memory with this data, the activity or something supporting it crashes dozens of times before finally getting lucky and succeeding, usually 40-60 minutes later. I've been trying to throttle the CPU usage with Thread.Sleep(), but it only seems to make it worse in the cloud.
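One alternative to throttling with Thread.Sleep() (which blocks a worker thread without reducing the total CPU work) is to split the 10k-object build across many smaller activity calls, so each invocation has a short CPU burst and the runtime can checkpoint between them. This is a sketch under assumed names (`BuildBatchActivity`, the batch size of 500), not the poster's actual code:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class BuildOrchestratorFunction
{
    [FunctionName("BuildOrchestrator")]
    public static async Task<List<string>> Run(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        const int total = 10_000, batchSize = 500;
        var tasks = new List<Task<string>>();
        for (int start = 0; start < total; start += batchSize)
        {
            // Each activity builds one batch and returns a small reference
            // (e.g. a blob name), keeping per-call CPU and payload sizes modest.
            tasks.Add(context.CallActivityAsync<string>(
                "BuildBatchActivity", new { start, batchSize }));
        }
        var refs = await Task.WhenAll(tasks);
        return refs.ToList();
    }
}
```

Combined with the `maxConcurrentActivityFunctions` throttle in host.json, this bounds how many batches run at once per worker instead of relying on sleeps.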
-
Hello team,
We have a function app running on a dedicated App Service plan (I1V2 with 3 instances). We recently added a durable function that fans out to ~100 sub-orchestration functions, which in turn run ~200 activity functions (each sub-orchestration calls 2 activities). We tried to concurrently 'enqueue' 800 such durable functions and observed dramatic performance degradation and excessive CPU usage.
Each activity function seems to execute relatively fast (~30-155 ms), but the sub-orchestrations seem to take 400s to complete on average, which slows down the completion of the top-level orchestrations (they took 950s to complete). The CPU usage on each worker spikes to >80% when the durable functions are running.
We took a profiler trace on the worker, and most of the CPU time seems to be spent in durable-storage-related modules (>50%), while the CPU time spent in our own function code was relatively small (~7%).
Given our scale (800 concurrent durables * 100 sub-orchestrations * 2 activity functions each) and 3 I1V2 workers, is this expected (perf- and resource-utilization-wise)? Is there anything we can do to improve it (besides scaling the ASP up and out)?
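The fan-out shape described above can be sketched like this (function names and inputs are illustrative, not our actual code):

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class FanOutFunctions
{
    [FunctionName("TopOrchestrator")]
    public static async Task RunTop(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // Fan out to ~100 sub-orchestrations, then fan in.
        var tasks = new List<Task>();
        for (int i = 0; i < 100; i++)
        {
            tasks.Add(context.CallSubOrchestratorAsync("SubOrchestrator", i));
        }
        await Task.WhenAll(tasks);
    }

    [FunctionName("SubOrchestrator")]
    public static async Task RunSub(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        int input = context.GetInput<int>();
        // Each sub-orchestration runs 2 activities concurrently.
        await Task.WhenAll(
            context.CallActivityAsync("Activity1", input),
            context.CallActivityAsync("Activity2", input));
    }
}
```

With 800 top-level instances this produces ~80,000 sub-orchestration histories and ~160,000 activity messages through the storage provider, which is consistent with most of the profiled CPU showing up in the storage layer.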
Some other information that might help:
We are on .NET 6, Azure Functions v4, in-process model, with Microsoft.Azure.WebJobs.Extensions.DurableTask 2.11.0.
We are using the default host.json; we did try tweaking it a bit, but the default still seems to be the most performant setting.
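For reference, the host.json knobs most relevant to this kind of load are the Durable Task concurrency throttles and the task hub partition count. The values below are illustrative starting points for experimentation, not settings recommended anywhere in this thread:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 10,
      "maxConcurrentOrchestratorFunctions": 5,
      "storageProvider": {
        "partitionCount": 8
      }
    }
  }
}
```

Lowering the two concurrency settings trades throughput for lower per-worker CPU; raising `partitionCount` (up to 16) spreads orchestration control-queue load across more partitions, which only helps when there are enough workers to own them.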