@dandavison (Contributor) commented Dec 7, 2025

What changed?

  • Add new public gRPC methods DescribeActivityExecution and GetActivityExecutionOutcome. See "New Get-Info / Get-Result / long-poll design" (api#673)
  • These replace the previous PollActivityExecution
  • Respond to additional API changes: inlining of ActivityOptions
  • Configure quota for the new methods in their blocking and non-blocking forms
  • Update test suite

Why?

  • Implements agreed Standalone Activity design

How did you test it?

  • built
  • covered by existing tests
  • added new functional test(s)

Note

Introduce DescribeActivityExecution and GetActivityExecutionOutcome APIs (replacing PollActivityExecution), inline ActivityOptions in StartActivityExecution, and update plumbing, quotas, and tests accordingly.

  • Public API changes:
    • Replace workflowservice.PollActivityExecution with DescribeActivityExecution and GetActivityExecutionOutcome (long-poll support via LongPollToken).
    • Inline ActivityOptions fields in StartActivityExecutionRequest and propagate via helpers.
  • Server/Frontend:
    • Add handlers, validation (ValidateDescribeActivityExecutionRequest, ValidateGetActivityExecutionOutcomeRequest), and long-poll behavior.
    • Update rate limiting, redirection, metadata, and quotas (new poll aliases; priority classifications).
  • Activity component:
    • Rename ScheduledTime to ScheduleTime throughout; adjust retry deadline calc.
    • Add outcome helper and buildGetActivityExecutionOutcomeResponse; refactor describe response builder.
  • Protos/clients (generated):
    • Update proto messages/services, gRPC stubs, layered/metric/retryable clients, and mocks to new methods and fields.
  • Tests:
    • Migrate tests from PollActivityExecution to DescribeActivityExecution/GetActivityExecutionOutcome; adapt assertions to new fields.
  • Deps:
    • Bump go.temporal.io/api version.

Written by Cursor Bugbot for commit 32cd5cf.

@dandavison dandavison requested review from a team as code owners December 7, 2025 15:38
@dandavison dandavison changed the title from "Standalone Activity: DescribeActivityExecution and GetActicityExecutionOutcome" to "Standalone Activity: DescribeActivityExecution and GetActivityExecutionOutcome" Dec 7, 2025
@dandavison dandavison force-pushed the standalone-activity-describe-and-get-outcome branch 3 times, most recently from 084a15f to 938ff87 Compare December 8, 2025 00:35
@bergundy (Member) left a comment

All of my comments are very minor. I don't feel like I need to re-review. Please address and merge.

// The current attempt number for this activity execution. Since task validation/exec happen outside of a lock, we
// need to guard against any concurrent operations where the originally intended task may be outdated.
int32 attempt = 1;
// The current attempt number for this activity execution. Since task validation/exec happen outside of a lock, we
Member

We need a linter/formatter so this doesn't happen again. Maybe open an issue and we'll tackle it at some later point?

Contributor Author

Done


// getOutcome retrieves the activity outcome (result or failure) if the activity has completed.
// Returns (result, failure, error) where at most one of result/failure is non-nil.
func (a *Activity) getOutcome(ctx chasm.Context) (*commonpb.Payloads, *failurepb.Failure, error) {
Member

Please don't use get for getters in Go: https://go.dev/doc/effective_go#Getters

Contributor Author

Thanks! changed to outcome
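
For readers unfamiliar with the convention referenced above, here is a minimal sketch of the naming style. The `activity` type and its `result` field are hypothetical stand-ins, not the real `Activity` component.

```go
package main

import "fmt"

// activity is a hypothetical stand-in for the real Activity component.
type activity struct {
	result string
}

// outcome is named after what it returns, without a "get" prefix,
// following the Effective Go guidance on getters.
func (a *activity) outcome() string {
	return a.result
}

func main() {
	a := &activity{result: "done"}
	fmt.Println(a.outcome()) // prints "done"
}
```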

RunId: key.RunID,
RunState: runState,
ScheduledTime: a.GetScheduledTime(),
ScheduleTime: a.GetScheduledTime(),
Member

Might as well rename to schedule_time in the internal protos too.

Contributor Author

Done

return nil, err
}
if result != nil {
response.Outcome = &workflowservice.DescribeActivityExecutionResponse_Result{Result: result}
Member

We can add an Outcome message in the protos and use it in both APIs instead of having a separate oneof in each response.

Contributor Author

Thanks, done, pushed to proto repos. Named it ActivityExecutionOutcome but lmk if you want it to be just Outcome.

return successful.GetOutput(), nil, nil
}
// Check for failure in outcome
if failure := activityOutcome.GetFailed().GetFailure(); failure != nil {
Member

Please verify that we don't set failure to an empty struct when we fail from an attempt.

Contributor Author

We were still doing that (I'd brought it up before but we hadn't removed it). I removed it in 32cd5cf (PTAL).
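
To illustrate why an empty placeholder matters for the `failure != nil` check in the snippet above, here is a small sketch. The `failure` and `attemptOutcome` types are hypothetical stand-ins for the generated proto types, not the actual messages.

```go
package main

import "fmt"

// failure is a hypothetical stand-in for failurepb.Failure.
type failure struct {
	Message string
}

// attemptOutcome is a hypothetical stand-in for the stored activity outcome.
type attemptOutcome struct {
	Failure *failure
}

// hasFailure mirrors the shape of the `failure != nil` check under review.
func hasFailure(o attemptOutcome) bool {
	return o.Failure != nil
}

func main() {
	unset := attemptOutcome{}                    // no failure recorded
	empty := attemptOutcome{Failure: &failure{}} // empty placeholder failure

	fmt.Println(hasFailure(unset)) // false
	fmt.Println(hasFailure(empty)) // true: the placeholder is mistaken for a real failure
}
```

With a placeholder set, a nil check alone cannot tell "no failure recorded here" from "a failure with no details", which is presumably why the reviewer asked for verification.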

Comment on lines +257 to +270
func activityOptionsFromStartRequest(req *workflowservice.StartActivityExecutionRequest) *apiactivitypb.ActivityOptions {
return &apiactivitypb.ActivityOptions{
TaskQueue: req.TaskQueue,
ScheduleToCloseTimeout: req.ScheduleToCloseTimeout,
ScheduleToStartTimeout: req.ScheduleToStartTimeout,
StartToCloseTimeout: req.StartToCloseTimeout,
HeartbeatTimeout: req.HeartbeatTimeout,
RetryPolicy: req.RetryPolicy,
}
}

// applyActivityOptionsToStartRequest copies normalized values from ActivityOptions
// back to the StartActivityExecutionRequest.
func applyActivityOptionsToStartRequest(opts *apiactivitypb.ActivityOptions, req *workflowservice.StartActivityExecutionRequest) {
Member

Why not pass the request into the ValidateAndNormalizeActivityAttributes function instead of using this options struct, which isn't part of the API? Is that function used for the workflow command and the activity operator APIs?

Contributor Author

Yes, it's because it's shared with workflow code. I did try to say that in the comment above: "for use with shared validation logic". Anyone got better ideas here?

}, req)
default:
return nil, serviceerror.NewInvalidArgumentf("unexpected wait policy type: %T", waitPolicy)
if ctx.Err() != nil {
Member

I thought about it during the weekend and I think we should only do this check if err != nil, just in case there's a race and we could still fulfill the response.

Contributor Author

Done! Looks like Cursorbot agrees (it had taken that position in a previous PR)

len(req.GetActivityId()), maxIDLengthLimit)
}
if runID := req.GetRunId(); runID != "" {
_, err := uuid.Parse(runID)
Member

Do we typically validate that run IDs are uuids in our API handlers?

Contributor Author

CompleteNexusOperation = "/temporal.api.nexusservice.v1.NexusService/CompleteNexusOperation"
// PollWorkflowHistoryAPIName is used instead of GetWorkflowExecutionHistory if WaitNewEvent is true in request.
PollWorkflowHistoryAPIName = "/temporal.api.workflowservice.v1.WorkflowService/PollWorkflowExecutionHistory"
// PollActivityExecutionAPIName is used instead of GetActivityExecutionOutcome if LongPollToken is set in request.
Member

Suggested change
- // PollActivityExecutionAPIName is used instead of GetActivityExecutionOutcome if LongPollToken is set in request.
+ // PollActivityExecutionAPIName is used instead of DescribeActivityExecution if LongPollToken is set in request.

// PollWorkflowHistoryAPIName is used instead of GetWorkflowExecutionHistory if WaitNewEvent is true in request.
PollWorkflowHistoryAPIName = "/temporal.api.workflowservice.v1.WorkflowService/PollWorkflowExecutionHistory"
// PollActivityExecutionAPIName is used instead of GetActivityExecutionOutcome if LongPollToken is set in request.
PollActivityExecutionAPIName = "/temporal.api.workflowservice.v1.WorkflowService/PollActivityExecution"
Member

I think being more specific is preferable here.

Suggested change
- PollActivityExecutionAPIName = "/temporal.api.workflowservice.v1.WorkflowService/PollActivityExecution"
+ PollActivityExecutionAPIName = "/temporal.api.workflowservice.v1.WorkflowService/PollActivityExecutionDescription"

Base automatically changed from standalone-activity-post-main-merged to standalone-activity-with-main-merged December 8, 2025 20:58
@dandavison dandavison changed the base branch from standalone-activity-with-main-merged to standalone-activity December 8, 2025 21:38
@dandavison dandavison force-pushed the standalone-activity-describe-and-get-outcome branch from 938ff87 to 4ba7671 Compare December 8, 2025 21:38
if shouldHaveFailure {
if details := a.LastAttempt.Get(ctx).GetLastFailureDetails(); details != nil {
return nil, details.GetFailure(), nil
}

Bug: Incorrect outcome returned for terminated/timed-out activities

The getOutcome function incorrectly returns the last attempt's failure for ACTIVITY_EXECUTION_STATUS_TERMINATED and ACTIVITY_EXECUTION_STATUS_TIMED_OUT statuses. As noted in the PR review, terminated and timed out activities should never return outcome from the last attempt - the outcome should only be derived from activityOutcome.GetFailed().GetFailure() for these statuses. Only ACTIVITY_EXECUTION_STATUS_FAILED and ACTIVITY_EXECUTION_STATUS_CANCELED (sometimes) should fall back to reading from LastAttempt.GetLastFailureDetails().
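
The rule described in this report can be sketched as follows. This illustrates the behavior the report asks for, not the merged code; the `status` constants, the `failure` type, and `failureFor` are hypothetical stand-ins.

```go
package main

import "fmt"

// failure is a hypothetical stand-in for failurepb.Failure.
type failure struct{ Message string }

// status is a hypothetical stand-in for the activity execution status enum.
type status int

const (
	statusFailed status = iota
	statusCanceled
	statusTerminated
	statusTimedOut
)

// failureFor prefers the failure stored on the outcome and falls back to the
// last attempt's failure only for failed or canceled activities.
func failureFor(s status, outcomeFailure, lastAttemptFailure *failure) *failure {
	if outcomeFailure != nil {
		return outcomeFailure
	}
	switch s {
	case statusFailed, statusCanceled:
		return lastAttemptFailure
	default:
		// Terminated and timed-out activities never read from the last attempt.
		return nil
	}
}

func main() {
	lastAttempt := &failure{Message: "attempt 3 failed"}
	fmt.Println(failureFor(statusFailed, nil, lastAttempt) == lastAttempt) // true
	fmt.Println(failureFor(statusTerminated, nil, lastAttempt) == nil)     // true
}
```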

return response, true, err
token := req.GetFrontendRequest().GetLongPollToken()
if len(token) == 0 {
return chasm.ReadComponent(ctx, ref, (*Activity).buildDescribeActivityExecutionResponse, req, nil)

Bug: Deadline buffer applied to non-long-poll describe requests

In DescribeActivityExecution, contextutil.WithDeadlineBuffer is called unconditionally before checking if the request is a long-poll (via LongPollToken). The old PollActivityExecution code returned early for non-long-poll requests BEFORE applying the deadline buffer. Now, non-long-poll describe requests have their context deadline capped at LongPollTimeout and reduced by LongPollBuffer, which could cause unexpected timeouts for regular describe calls that previously had longer deadlines.

@dandavison dandavison merged commit 8a7d26e into standalone-activity Dec 8, 2025
14 checks passed
@dandavison dandavison deleted the standalone-activity-describe-and-get-outcome branch December 8, 2025 23:39