Activity Function started multiple times before completion #392
Comments
Hey. Any news about this?
We have noticed a similar issue. In some cases, our Entity function appears to be triggered multiple times without completing. It seems to stop partway through without generating any further error logs or traces. After approximately 3-15 minutes, the Entity is signalled again and completes successfully. When the issue is not occurring, the whole process takes only 100-300 ms end to end. This happens fairly regularly, so if there are any debug logs we can collect to help identify the issue, please let me know.
@marius-stanescu and @adamconway - what language are you using, by the way? @sebastianburckhardt - could you take a look? This seems to be happening for both Orchestrations and Entities when using Netherite.
C#
Common causes could include infinite loops in the application code or out-of-memory issues. There could also be process crashes caused by something unrelated to the activity that still impact the activity's execution. In all of those cases, it is expected that the activity will be retried later. Of the error messages you posted, none of the Netherite-internal ones look out of the ordinary; however, these are worrisome:
They could indicate that you are using too much CPU on a worker, which can cause all kinds of issues (including the ones you are describing). I can look at our internal telemetry if you give me the app name and a time window (it looks like I can infer the time window from what you posted earlier).
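Since an interrupted activity is expected to be retried, the activity body may run more than once for the same logical request. One way to tolerate that is an idempotency guard keyed by an operation ID. The following is only a minimal sketch in plain C#, not Durable Functions or Netherite code: `CompletedWorkStore`, `IdempotentActivity`, and the operation ID are hypothetical names, and the in-memory dictionary stands in for whatever durable store the app would really use.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for a durable store keyed by operation ID.
// An in-memory dictionary is used only to keep the sketch runnable.
public sealed class CompletedWorkStore
{
    private readonly ConcurrentDictionary<string, string> _results =
        new ConcurrentDictionary<string, string>();

    public bool TryGetResult(string operationId, out string result)
    {
        if (_results.TryGetValue(operationId, out var found))
        {
            result = found;
            return true;
        }

        result = string.Empty;
        return false;
    }

    public void SaveResult(string operationId, string result)
    {
        _results[operationId] = result;
    }
}

public static class IdempotentActivity
{
    // Counts how often the expensive work actually ran (for illustration only).
    public static int SideEffectCount;

    // Hypothetical activity body: safe to call again after an interrupted attempt.
    public static string Run(CompletedWorkStore store, string operationId, string input)
    {
        // A retried invocation finds the stored result and skips the work.
        if (store.TryGetResult(operationId, out var existing))
        {
            return existing;
        }

        // Placeholder for the real, expensive work.
        SideEffectCount++;
        var result = input.ToUpperInvariant();

        // Record completion so that later retries become no-ops.
        store.SaveResult(operationId, result);
        return result;
    }
}
```

Calling `IdempotentActivity.Run` twice with the same operation ID performs the work once and returns the stored result the second time. If the process dies after the work but before `SaveResult`, the work still repeats, so the guard narrows the window rather than closing it; the work itself should be safe to repeat where possible.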
Also using C# here. We recently switched to Netherite; this code previously ran for multiple years on the Storage Provider without issue, which is why we suspect the cause is something in Netherite rather than application code. I've provided Function app names/IDs, timings, logs etc. via support ticket 2408280040000227. We also noticed that setting the minimum scale of our Function Apps to a higher number seems to reduce the impact of the issue (fewer cancelled executions and shorter delays), but the delay is still in the range of 30-600 seconds in some cases.
The app name is
Summary:
We're observing instances where the same Activity Function is invoked multiple times before completing.
Additional information:
This Activity Function is started by an Orchestrator that awaits its completion. Although the Activity Function took around 35 seconds to execute completely (when it succeeded), the Orchestrator's overall duration was more than 35 minutes.
According to the correlated dependency logs, each time the Activity Function is invoked it makes some progress but doesn't finish. There are no exceptions or errors during processing; the Function seems to just "die" all of a sudden and is started again later. It finally completed more than 33 minutes after it was first invoked.
Logs:
The series of logs from the execution environment that show the recurring invocation of the Activity Function without successful completion:
At the same time, we observed various types of warnings, such as performance issues, storage conflicts, timeouts, and resource utilization alerts, that might be related to the unexpected behaviour described above. Here are some examples of such warnings:
What could be causing the Activity Function to stop before completing and be started over and over again? And how can this be prevented?