Need help finding the cause of a performance drop after updating to the Azure Functions durable extension v2.6.1 (from v2.5.1) #2358
Replies: 1 comment 1 reply
-
It's hard to say what the source of the slowness is, or whether it's even related to the specific version update. Both versions you're referring to are quite old, though, so I wonder if it would be better to try upgrading to v2.9.0 (the latest as of right now) instead of v2.6.1.

That said, split-brain errors are pretty much never expected, so something appears to be off. What configuration values do you have in your host.json file? Also, are you doing any auto-scaling of your environment? Split-brain was more common back when we used an older partition management scheme and the number of instances hosting an app changed (e.g. scaling from 1 to 10), but we've since defaulted to a cooperative partition management strategy that should get rid of these types of problems.

The other time I've seen split-brain is when multiple instances of the app are running on the same machine, i.e. the machine name environment variable is the same. This confuses our lease mechanism, though it's typically something you see when trying to run multiple instances on a local machine. I assume a containerized app wouldn't have this problem.

Lastly, is this problem persistent, or does it eventually go away? If it eventually goes away, that might suggest it is related to some change in the update, though I didn't see anything obvious when scanning the release notes for v2.6.0 and v2.6.1.
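For reference, here is a minimal host.json sketch (not the asker's actual configuration) showing the durableTask settings most relevant to the points above. The task hub name and numeric values are illustrative assumptions rather than defaults or recommendations, and `useLegacyPartitionManagement` is only meaningful on extension versions that ship the newer partition manager, so verify against the host.json reference for the version in use:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "hubName": "MyTaskHub",
      "maxConcurrentOrchestratorFunctions": 10,
      "maxConcurrentActivityFunctions": 10,
      "storageProvider": {
        "partitionCount": 4,
        "maxQueuePollingInterval": "00:00:30",
        "useLegacyPartitionManagement": false
      }
    }
  }
}
```

If the AKS deployment auto-scales, it is also worth confirming that each replica reports a distinct machine name, since identical names can confuse the lease mechanism as described above.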
-
Hi,
I would like to ask for some help or guidance in finding out why there is a drop in performance after updating my app (running inside Azure Kubernetes) to durable extension v2.6.1 (from v2.5.1). There are no code or configuration changes apart from the durable extension library version change. (I didn't update to the latest durable extension because of the issue filed in discussion #2335.)
Below is my app info:
Azure Region: West US
App info: C# .NET 6.0 on Azure Kubernetes (1 orchestrator function with a 3-activity chaining pattern; see the sketch after this list)
ApplicationInsights Name: ncppfc01aiseus01
Storage Account Name: ncppfc01xenstrwus02
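For context, here is a minimal sketch of the orchestration shape described above (one orchestrator chaining three activities) using the in-process Durable Functions 2.x programming model. The function names and payloads are hypothetical placeholders, not the actual app code:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class ChainedOrchestration
{
    // Orchestrator: calls three activities sequentially (function-chaining pattern).
    [FunctionName("RunChain")]
    public static async Task<string> RunChain(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        string input = context.GetInput<string>();
        string step1 = await context.CallActivityAsync<string>("Step1", input);
        string step2 = await context.CallActivityAsync<string>("Step2", step1);
        return await context.CallActivityAsync<string>("Step3", step2);
    }

    // Activities: trivial placeholders standing in for the real work.
    [FunctionName("Step1")]
    public static string Step1([ActivityTrigger] string input) => input + ":step1";

    [FunctionName("Step2")]
    public static string Step2([ActivityTrigger] string input) => input + ":step2";

    [FunctionName("Step3")]
    public static string Step3([ActivityTrigger] string input) => input + ":step3";
}
```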
The load test is done with JMeter, firing 8000 requests per minute at the app for 15 minutes. The result and (timing) response of each request is captured and benchmarked.
Load test session (with durable extension v2.6.1): 14-Jan-2023, 09:38:00 to 10:07:00 UTC
Sample orchestration instance id: b921a973-8728-4c32-b37b-ac2b3513e721 (AppInsights link)

Load test session (with durable extension v2.5.1): 14-Jan-2023, 08:06:00 to 08:35:00 UTC
Sample orchestration instance id: e9c62ccc-de90-4954-913a-e34645e40ec8 (AppInsights link)

Here is the AppInsights E2E timeline visualisation. I would like to understand why the orchestrator is slow when the app is updated with durable extension v2.6.1.
I can see some high-delay, PendingOrchestratorMessageLimitReached, and split-brain traces in the trace messages. The occurrence of such traces is much higher with durable extension v2.6.1 than with v2.5.1, under the same configuration and the same load test conditions. Is there any parameter we need to update after upgrading to durable extension v2.6.1?
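As far as I understand, the PendingOrchestratorMessageLimitReached warning comes from the Azure Storage backend when the number of buffered orchestrator messages exceeds the control-queue buffer threshold, after which the worker temporarily stops dequeuing from that control queue. If that warning dominates the traces, the host.json settings below are the most directly related ones; the values shown are illustrative assumptions, not tuned recommendations:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 32,
      "extendedSessionsEnabled": true,
      "storageProvider": {
        "controlQueueBufferThreshold": 256,
        "controlQueueBatchSize": 32
      }
    }
  }
}
```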
Thank you