Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PERF] Simple workflow with nested ForEach taking a while to run #6160

Open
badsyntax opened this issue Nov 28, 2024 · 1 comment
Open

[PERF] Simple workflow with nested ForEach taking a while to run #6160

badsyntax opened this issue Nov 28, 2024 · 1 comment
Milestone

Comments

@badsyntax
Copy link

Performance Improvement Request

Performance Issue Overview

Hi there, thanks for the great library. We've noticed some perf issues that we'd like clarification on. We're using Elsa to process API responses that typically involve something like this:

Screenshot 2024-11-28 at 11 21 01

In cases where we have a massive data set, eg 1300+ items in a list, then the workflow takes a long time to complete (longer than 30 seconds). Most of the time is spent in the ForEach activity.

Benchmarks and Metrics

I've created an example repo that demostrates the perf issue: https://github.com/badsyntax/ElsaRepo

For this example, it processes 1300 items in a list, and takes 10 seconds to run on my MacBook Pro M1.

We've done some basic profiling using Rider, and it looks like it's making more than a billion calls to GetDescendants

9.8% < GetDescendants>b_0 • 27,141 ms / 27,191 ms • 1 081 734 156 calls • Elsa.Extensions.ActivityExecutionContextExtensions+<>c_DisplayClass22_0.<GetDescendants>b_0(ActivityExecutionCon
Screenshot 2024-11-28 at 11 12 21

Looking at a much simpler example, it processes 1300 items and takes +-170ms to run on my MacBook Pro M1.

It calls GetDescendents 5 million times.

Screenshot 2024-11-28 at 11 14 32

We're not very experienced with profiling, so we might be reading this all wrong, but it certainly is taking longer than we expected and hoped, and it does seem excessive to be making so many calls.

Here's the rider profiling dumps

ElsaProfiling.zip

Additional Context

Happy to provide any additional context or information you require. Is there any way this can be optimised at all?

@badsyntax badsyntax changed the title [PERF] Simple workflow with nested ForEeach taking a while to run [PERF] Simple workflow with nested ForEach taking a while to run Nov 29, 2024
@badsyntax
Copy link
Author

badsyntax commented Nov 30, 2024

The problem is within the MapAsync method of DefaultActivityExecutionMapper:

public async Task<ActivityExecutionRecord> MapAsync(ActivityExecutionContext source)

MapAsync is called for each activity context:

var activityExecutionContexts = context.ActivityExecutionContexts;
var tasks = activityExecutionContexts.Select(activityExecutionMapper.MapAsync).ToList();

For each activity context, we find the descendants, and this is what's slowing everything down:

private static ActivityStatus GetAggregateStatus(ActivityExecutionContext context)
{
// If any child activity is faulted, the aggregate status is faulted.
var descendantContexts = context.GetDescendants().ToList();
if (descendantContexts.Any(x => x.Status == ActivityStatus.Faulted))
return ActivityStatus.Faulted;
return context.Status;
}

I attempted to improve the perf here but wasn't able to come up with anything effective. I do reckon this can optimised somehow. The problem is that there's just so many activity contexts and calling GetDescendants recursively to determine the actvity status is an expensive operation, which is only evident when working large data sets in a foreach. The perf problem is exacerbated what seems to be exponntially by adding more activies or by adding more items to the ForEach activity.

I've been able to find a workaround that improves the perf rather significantly by not enabling ActivityExecutionLogPersistence && WorkflowExecutionLogPersistence middlewares. By doing so, the example goes from 10s to 2s for me, and in other example it goes from 16s to 5.5s, which is great, as we don't require log persistence.

Here's an example:

services.AddElsa(elsa =>
{
    elsa.UseWorkflows(workflows =>
    {
        workflows.WithWorkflowExecutionPipeline(pipeline => pipeline
            .Reset()
            .UseEngineExceptionHandling()
            .UseExceptionHandling()
            .UseDefaultActivityScheduler());
    });
});

@sfmskywalker sfmskywalker moved this to Triage in ELSA 3 Dec 12, 2024
@sfmskywalker sfmskywalker added this to the Elsa 3.4 milestone Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Triage
Development

No branches or pull requests

2 participants