Replies: 4 comments · 15 replies
-
Hey, I can't provide a solution to your problem, however I can recall a handful of times where people have tried to do interesting things with the message lock in the same Function that starts the Orchestration, often with little success. Generally the advice is to not blur the Service Bus message lifetime with the Orchestration lifetime. Put simply: if you can start the Orchestration successfully, then complete the SB message ASAP (rough sketch below). Can I ask why you want to observe the status of the Orchestration and subsequently abandon the SB message if the Orchestration is not in your desired state? I might be able to provide some other guidance.
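To make the "complete ASAP" shape concrete — a minimal sketch only; the topic, subscription, connection, and orchestrator names are placeholders, and it assumes the Microsoft.Azure.ServiceBus-based bindings visible in your stack trace:

```csharp
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class StartAndCompleteAsap
{
    [FunctionName("StartAndCompleteAsap")]
    public static async Task Run(
        [ServiceBusTrigger("mytopic", "mysub", Connection = "ServiceBusConnection", IsSessionsEnabled = true)] Message message,
        MessageReceiver messageReceiver,
        [DurableClient] IDurableOrchestrationClient starter)
    {
        // If StartNewAsync throws, the function fails and the message is redelivered.
        await starter.StartNewAsync("MyOrchestrator", Encoding.UTF8.GetString(message.Body));

        // The Orchestration is now durably queued, so nothing else needs the
        // message lock: complete immediately instead of watching the status.
        await messageReceiver.CompleteAsync(message.SystemProperties.LockToken);
    }
}
```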
-
As for why the orchestrator functions aren't starting, try using the Azure Functions Diagnose and Solve tool and searching for "Durable Functions" to see if any known issues are being detected. Note that this tool works best when you're using the latest version(s) of the Durable Functions extension.
-
@olitomlinson If the orchestrator takes too long (longer than 1 min), we want to abandon the message so it goes back to the queue. Maybe we should also terminate the orchestrator instance in that case? @cgillum The exception doesn't seem to occur in the function itself, so it's not detected by this tool.
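If we do terminate, presumably it would look something like this hypothetical helper around IDurableOrchestrationClient.TerminateAsync (the helper name and reason string are placeholders):

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class OrchestrationCleanup
{
    // Hypothetical helper: give up on an instance that never left Pending.
    // Note: Activities already scheduled by the instance will still run.
    public static Task AbortStalledInstanceAsync(IDurableOrchestrationClient starter, string instanceId)
    {
        return starter.TerminateAsync(instanceId, "Did not start within 60 seconds");
    }
}
```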
-
Yes, you could terminate the Orchestration, however bear in mind that any Activities that have already been scheduled by the Orchestration instance will continue to run to completion/failure. In real terms, this means that just because you terminated your Orchestration after 60 seconds, there may still be processing occurring, which may or may not be a problem in your use-case.

May I ask a question? When 60 seconds elapses and the message becomes available again on the queue, what then? Are you intending for Service Bus to redeliver the message and try the whole Orchestration operation again? If so, can I suggest that you shift this retry logic and 60-second timeout logic out of the Service Bus triggered function and into the Orchestration itself? An Orchestration is a perfect place to elegantly handle timeouts and subsequent retries of an operation, and this will massively simplify the architecture (see the sketch below).

If there are reasons why you must use Service Bus for redelivery on timeout of the Orchestration, I would love to hear more to help understand the use-case :)
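A minimal sketch of that pattern — a durable timer raced against the real work; the function name, Activity name, and input type are placeholders:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class TimeoutOrchestration
{
    [FunctionName("ProcessWithTimeout")]
    public static async Task Run([OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        string input = context.GetInput<string>();

        using var cts = new CancellationTokenSource();

        // A durable timer survives replays and host restarts, unlike Task.Delay.
        DateTime deadline = context.CurrentUtcDateTime.AddSeconds(60);
        Task timeout = context.CreateTimer(deadline, cts.Token);
        Task work = context.CallActivityAsync("DoWork", input);

        if (await Task.WhenAny(work, timeout) == work)
        {
            cts.Cancel(); // cancel the timer so the instance can finish
            await work;   // surface any Activity failure
        }
        else
        {
            // Timed out: restart with a fresh history instead of relying on
            // Service Bus redelivery as the retry mechanism.
            context.ContinueAsNew(input);
        }
    }
}
```

In practice you'd carry a retry counter in the input so ContinueAsNew can't loop forever, and/or use CallActivityWithRetryAsync for per-Activity retries.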
-
I have a Service Bus triggered function. It listens to a subscription that is session-enabled. This function then starts an orchestration.
The trigger function logic is, in outline, the following:
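(A simplified sketch of that logic; the names, payload handling, and polling interval below are illustrative rather than our exact code:)

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class DeviceMessageTrigger
{
    [FunctionName("DeviceMessageTrigger")]
    public static async Task Run(
        [ServiceBusTrigger("mytopic", "mysub", Connection = "ServiceBusConnection", IsSessionsEnabled = true)] Message message,
        MessageReceiver messageReceiver,
        [DurableClient] IDurableOrchestrationClient starter)
    {
        string instanceId = await starter.StartNewAsync("DeviceOrchestrator", Encoding.UTF8.GetString(message.Body));

        // Poll for up to a minute, waiting for the orchestration to leave Pending.
        DateTime deadline = DateTime.UtcNow.AddMinutes(1);
        while (DateTime.UtcNow < deadline)
        {
            DurableOrchestrationStatus status = await starter.GetStatusAsync(instanceId);
            if (status != null && status.RuntimeStatus != OrchestrationRuntimeStatus.Pending)
            {
                await messageReceiver.CompleteAsync(message.SystemProperties.LockToken);
                return;
            }
            await Task.Delay(TimeSpan.FromSeconds(5));
        }

        // Still Pending after a minute: put the message back for redelivery.
        // This is the call that blows up with SessionLockLostException.
        await messageReceiver.AbandonAsync(message.SystemProperties.LockToken);
    }
}
```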
At a certain point, the function in all our non-production environments fails to start the orchestration: the orchestration record is created in the instances table, but it keeps the Pending status for a long time.
This causes the AbandonAsync statement to be executed, and when it is, we receive a SessionLockLostException:
```
Exception while executing function: xxxxx
 ---> Microsoft.Azure.ServiceBus.SessionLockLostException: Session lock lost. Accept a new session
   at async Microsoft.Azure.ServiceBus.Core.MessageReceiver.DisposeMessagesAsync(IEnumerable`1 lockTokens, Outcome outcome)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.ServiceBus.RetryPolicy.RunOperation(Func`1 operation, TimeSpan operationTimeout)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.ServiceBus.Core.MessageReceiver.AbandonAsync(String lockToken, IDictionary`2 propertiesToModify)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async DeviceManagement.xxxxx.Run(IDurableOrchestrationClient starter, Message message, MessageReceiver messageReceiver) at D:\a\1\s\xxxxx
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.WebJobs.Host.Executors.VoidTaskMethodInvoker`2.InvokeAsync[TReflected,TReturnType](TReflected instance, Object[] arguments) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\VoidTaskMethodInvoker.cs:20
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2.InvokeAsync[TReflected,TReturnValue](Object instance, Object[] arguments) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionInvoker.cs:52
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:572
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:518
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:296
   End of inner exception
   at Microsoft.Azure.WebJobs.ServiceBus.SessionMessageProcessor.CompleteProcessingMessageAsync(IMessageSession session, Message message, FunctionResult result, CancellationToken cancellationToken)
   at async Microsoft.Azure.WebJobs.ServiceBus.Listeners.ServiceBusListener.ProcessSessionMessageAsync(IMessageSession session, Message message, CancellationToken cancellationToken)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at async Microsoft.Azure.ServiceBus.SessionReceivePump.MessagePumpTaskAsync(IMessageSession session)
```
The combination of these two issues leaves us with a huge number of active and dead-lettered messages in our subscription. Every abandoned message is redelivered even though AbandonAsync threw SessionLockLostException, and each message is retried 10 times (the subscription's MaxDeliveryCount) until it goes to the DLQ.
How can I investigate why my function fails to start the orchestrator? I don't see any exception; the orchestrator just stays Pending for a long time (hours).
How long does the session lock stay active? I tried adding sessionHandler.maxAutoRenewDuration, but I saw a couple of Azure issues saying that the setting is not working, like this one: Azure/azure-functions-servicebus-extension#144
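(For reference, this is the host.json shape for that setting as we understand it — the durations and concurrency values are examples; this assumes the 4.x Service Bus extension, where the section is named sessionHandlerOptions:)

```json
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "sessionHandlerOptions": {
        "autoComplete": false,
        "maxAutoRenewDuration": "00:05:00",
        "maxConcurrentSessions": 16
      }
    }
  }
}
```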
What do we need to do when handling SessionLockLostException?
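One option we're considering is to treat it as benign when abandoning — the lock is already gone, so Service Bus will make the message available again by itself. A sketch, with a hypothetical helper name:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Microsoft.Azure.ServiceBus.Core;

public static class SafeAbandon
{
    // Hypothetical helper: abandon without failing the whole function
    // invocation when the session lock has already expired.
    public static async Task TryAbandonAsync(MessageReceiver receiver, Message message)
    {
        try
        {
            await receiver.AbandonAsync(message.SystemProperties.LockToken);
        }
        catch (SessionLockLostException)
        {
            // Lock already lost: the message becomes available for redelivery
            // on its own, so there is nothing more to do here.
        }
    }
}
```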
Thanks