Random Non-Deterministic exception #2347
While executing a certain orchestration, the following error will randomly appear:
The code isn't changing during execution, and it shouldn't even be triggering a timer: no timers are created in this orchestration, yet they're being created for unknown reasons. I've attached an export from the history table for the orchestration in question. I've replaced the return values from … This history doesn't make any sense to me. Any suggestions on how to debug this issue? I've never been able to reproduce it locally, but it happens in prod frequently enough to be a problem.
Replies: 2 comments 3 replies
Also, sometimes it's the other way around, where a later run tries to create a timer when it shouldn't:
The one constant is that it's always sequence number 2.
I found the root cause after accidentally reproducing the issue. It was indeed a non-determinism error, but a stranger one than I suspect the authors of Durable Functions anticipated.

The problem occurs when Durable Functions is unable to deserialize the JSON response from an activity. The reason this happens, and the reason it appeared random, is that a DLL needed to deserialize the response isn't loaded properly at the time of playback if the orchestration is the first to run in a new process. It's dynamically loaded earlier in the flow, making the entire thing technically non-deterministic. I'll need to find a better way to ensure the needed DLLs are all loaded in time for the orchestration.

So my main feedback for the team is that it would be nice if the non-determinism errors had different phrasing when implicit timers created by retries are involved. The only reason I knew to look at failures and retries was another issue or discussion I found while looking for help.
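To make the mechanism concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not Durable Functions' actual replay logic or API. The event names, the `result_type_loaded` flag (standing in for "the needed DLL is loaded in this process"), and both functions are hypothetical. It shows how a failure to deserialize a stored activity result can turn into an implicit retry timer that the recorded history does not contain.

```python
import json


class NonDeterminismError(Exception):
    """Replay produced an action that differs from the recorded history."""


def orchestrator_actions(raw_result, result_type_loaded):
    """Return the actions a toy orchestrator takes for one activity call with retry.

    result_type_loaded is a hypothetical flag standing in for "the DLL needed
    to deserialize the activity result is loaded in this process".
    """
    actions = ["TaskScheduled:A"]  # the orchestrator always calls activity A first
    try:
        if not result_type_loaded:
            raise TypeError("result type is not loaded yet")
        json.loads(raw_result)
        actions.append("TaskScheduled:B")  # success: move on to the next activity
    except (TypeError, ValueError):
        actions.append("TimerCreated")  # failure: the retry policy waits on a timer
    return actions


def replay(recorded, raw_result, result_type_loaded):
    """Compare this run's actions with the recorded ones, position by position."""
    actual = orchestrator_actions(raw_result, result_type_loaded)
    for seq, (want, got) in enumerate(zip(recorded, actual)):
        if want != got:
            raise NonDeterminismError(
                f"seq {seq}: history has {want}, replay wants {got}"
            )
    return actual
```

In this model, a history recorded with `result_type_loaded=True` and replayed with `False` fails because the replay wants a timer the history never created. Recording with `False` and replaying with `True` fails the other way around, which would match both variants of the error described in this thread.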
…