-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix continueAsNew with large snapshot(>4MB) #482
Conversation
dda96dd
to
72271cf
Compare
- name: Dump docker logs | ||
if: always() | ||
uses: jwalton/gh-docker-logs@v2 | ||
# TODO cadence sticky test is flaky |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @lwolczynski disabled this as you saw today that it started to be flaky.
Having the sticky disabled for Temporal should be enough for protecting determinism.
// NOTE: using DeterministicKeys so that the JSON snapshot for continueAsNew is stable for pagination | ||
// TODO: we should use DeterministicKeys for every map iteration in interpreter for safety | ||
// https://github.com/indeedeng/iwf/issues/510 | ||
for _, k := range DeterministicKeys(am.searchAttributes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary otherwise the snapshot will have different order for different pages.
It wasn't a problem becase I didn't test more than one page for continueAsNew (and there wasn't prod use case running into this, luckily)
StaleSkipTimerSignals: c.timerProcessor.Dump(), | ||
}, nil | ||
}) | ||
return c.provider.SetQueryHandler(ctx, service.ContinueAsNewDumpQueryType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having a hard time understanding how/where this pagination code (SetQueryHandler) is called when the workflow is queried. What is the code path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to use IDE to browse the code path:
- workflowImpl.go:
if input.IsResumeFromContinueAsNew { LoadInternalsFromPreviousRun(...) }
- continueAsNewer.go:
provider.ExecuteActivity(&resp, false, ctx, DumpWorkflowInternal...
- activityImpl.go:
apiClient.DefaultApi.ApiV1WorkflowInternalDumpPost(ctx)
- service.go:
s.client.QueryWorkflow(ctx, &pageOfSnapshot, request.GetWorkflowId(), request.GetWorkflowRunId(), service.ContinueAsNewDumpQueryType, request)
- continueAsNewer.go : the registered handler of
SetQueryHandler(ctx, service.ContinueAsNewDumpQueryType,
cccbf89
to
8f09ad8
Compare
assertions := assert.New(t) | ||
|
||
// start test workflow server | ||
wfHandler := signal.NewHandler() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Wouldn't it be better to create a handler class for this specific test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like using the existing is better. The test only need a workflow to do nothing, so that we can set some large DAs. The existing signal workflow is exactly like that. I would just copy the code if so.
str, err := json.Marshal(req) | ||
if len(str) > 1000 { | ||
str = str[0:1000] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this logging went to Datadog, is there a specific reason why we need to limit its size? I don't see a reason to limit this if it's not going to Temporal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the request object. Logging will be too expensive too if the req is too big (>100 bytes).
This one is actually not going to Datadog because we have another internal handler (see handler.go
in internal repo).
I wanted to limit this otherwise my local dev will look pretty bad when sending the 1MB request.
if request.PageSizeInBytes > 0 { | ||
pageSize = request.PageSizeInBytes | ||
} | ||
lenInDouble := float64(len(wholeData)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no need to declare this since it's only used in line 151. I also believe they could be float32s since you are really dividing ints and just doing type conversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump sleep add test rename fix static check tweak config done minor fix done fix comment fix test disable cadence test reduce test load reduce test load reduce test load fix test remove test remove test use copy
e9da04b
to
5e066b5
Compare
Description
Checklist
Related Issue
Closes #397