Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add workflow_start_delay to StartChildWorkflowExecutionCommand #441

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions openapi/openapiv2.json
Original file line number Diff line number Diff line change
Expand Up @@ -10208,6 +10208,10 @@
"inheritBuildId": {
"type": "boolean",
"description": "If this is set, the child workflow inherits the Build ID of the parent. Otherwise, the assignment\nrules of the child's Task Queue will be used to independently assign a Build ID to it."
},
"workflowStartDelay": {
"type": "string",
"description": "Time to wait before dispatching the child workflow's first task. Cannot\nbe used with `cron_schedule`. If the workflow gets a signal before the\ndelay, a workflow task will be dispatched and the rest of the delay will\nbe ignored."
}
}
},
Expand Down Expand Up @@ -10325,6 +10329,10 @@
"inheritBuildId": {
"type": "boolean",
"description": "If this is set, the child workflow inherits the Build ID of the parent. Otherwise, the assignment\nrules of the child's Task Queue will be used to independently assign a Build ID to it."
},
"workflowStartDelay": {
"type": "string",
"description": "Time to wait before dispatching the child workflow's first task. Cannot\nbe used with `cron_schedule`. If the workflow gets a signal before the\ndelay, a workflow task will be dispatched and the rest of the delay will\nbe ignored."
}
}
},
Expand Down
8 changes: 8 additions & 0 deletions openapi/openapiv3.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -7925,6 +7925,14 @@ components:
description: |-
If this is set, the child workflow inherits the Build ID of the parent. Otherwise, the assignment
rules of the child's Task Queue will be used to independently assign a Build ID to it.
workflowStartDelay:
pattern: ^-?(?:0|[1-9][0-9]{0,11})(?:\.[0-9]{1,9})?s$
type: string
description: |-
Time to wait before dispatching the child workflow's first task. Cannot
be used with `cron_schedule`. If the workflow gets a signal before the
delay, a workflow task will be dispatched and the rest of the delay will
be ignored.
StartWorkflowExecutionRequest:
type: object
properties:
Expand Down
5 changes: 5 additions & 0 deletions temporal/api/command/v1/message.proto
Original file line number Diff line number Diff line change
Expand Up @@ -226,6 +226,11 @@ message StartChildWorkflowExecutionCommandAttributes {
// If this is set, the child workflow inherits the Build ID of the parent. Otherwise, the assignment
// rules of the child's Task Queue will be used to independently assign a Build ID to it.
bool inherit_build_id = 17;
// Time to wait before dispatching the child workflow's first task. Cannot
// be used with `cron_schedule`. If the workflow gets a signal before the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think signal won't/shouldn't cause the workflow task to be dispatched.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signal on the parent workflow shouldn't cause it to be dispatched, but I'd expect a signal on the child workflow that's started with a delay to cause it to immediately start (same as existing WF logic), is that incorrect?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Child workflow is still a workflow, if it has a start delay, then signal will not trigger a workflow task to be dispatched thus unblocking the workflow.

Are you saying ^ is not the case? If so, it's a bug we need to fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying ^ is not the case?

Yes, per StartWorkflowExecutionRequest.workflow_start_delay docs, the behavior is that a signal interrupts the delay

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the comment. If what the comment says is true, that needs to be fixed.

But I just tried this locally and signal won't interrupt the start delay.

➜ temporal (main) ✔ temporal workflow start --type TestWorkflow -w test-workflow-id -t test-task-queue --start-delay 1h
Running execution:
  WorkflowId  test-workflow-id
  RunId       c619050c-adf0-4ed3-acaa-d1b1f8607320
  Type        TestWorkflow
  Namespace   default
  TaskQueue   test-task-queue

➜ temporal (main) ✔ ./tctl wf show --wid test-workflow-id                      
  1  WorkflowExecutionStarted  {WorkflowType:{Name:TestWorkflow}, ParentInitiatedEventId:0,  
                                TaskQueue:{Name:test-task-queue, Kind:Normal},                
                                WorkflowExecutionTimeout:0s, WorkflowRunTimeout:0s,           
                                WorkflowTaskTimeout:10s, Initiator:Unspecified,               
                                OriginalExecutionRunId:c619050c-adf0-4ed3-acaa-d1b1f8607320,  
                                Identity:temporal-cli:[email protected],       
                                FirstExecutionRunId:c619050c-adf0-4ed3-acaa-d1b1f8607320,     
                                Attempt:1, FirstWorkflowTaskBackoff:1h0m0s,                   
                                ParentInitiatedEventVersion:0}                                
   
➜ temporal (main) ✔ ./tctl wf signal --wid test-workflow-id --rid c619050c-adf0-4ed3-acaa-d1b1f8607320 --name test-signal
Signal workflow succeeded.

➜ temporal (main) ✔ ./tctl wf show --wid test-workflow-id                                                              
  1  WorkflowExecutionStarted   {WorkflowType:{Name:TestWorkflow}, ParentInitiatedEventId:0,  
                                 TaskQueue:{Name:test-task-queue, Kind:Normal},                
                                 WorkflowExecutionTimeout:0s, WorkflowRunTimeout:0s,           
                                 WorkflowTaskTimeout:10s, Initiator:Unspecified,               
                                 OriginalExecutionRunId:c619050c-adf0-4ed3-acaa-d1b1f8607320,  
                                 Identity:temporal-cli:[email protected],       
                                 FirstExecutionRunId:c619050c-adf0-4ed3-acaa-d1b1f8607320,     
                                 Attempt:1, FirstWorkflowTaskBackoff:1h0m0s,                   
                                 ParentInitiatedEventVersion:0}                                
  2  WorkflowExecutionSignaled  {SignalName:test-signal,                                      
                                 Identity:[email protected]}           

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same results as you:


$ temporal workflow show -w parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d        
Progress:
  ID           Time                     Type           
    1  2024-08-14T22:24:33Z  WorkflowExecutionStarted  
$ temporal workflow describe -w parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d 
Execution Info:
  WorkflowId            parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d
  RunId                 a3560511-7678-44fb-a9a4-22a7cbea641d
  Type                  SampleParentWorkflow
  Namespace             default
  TaskQueue             child-workflow
  StartTime             2 minutes ago
  ExecutionTime         7 minutes from now
  StateTransitionCount  1
  HistoryLength         1
  HistorySize           257

Pending Activities: 0
Pending Child Workflows: 0
$ temporal workflow signal -w parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d --name "whatever-signal"
Signal workflow succeeded
$ temporal workflow show -w parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d                                                                                                                                           
Progress:
  ID           Time                     Type           
    1  2024-08-14T22:24:33Z  WorkflowExecutionStarted  
    2  2024-08-14T22:26:56Z  WorkflowExecutionSignaled 

$ temporal workflow describe -w parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d                                                                                                                                       
Execution Info:
  WorkflowId            parent-workflow_2d80ca74-86ff-43b1-a540-63e5e8d59e8d
  RunId                 a3560511-7678-44fb-a9a4-22a7cbea641d
  Type                  SampleParentWorkflow
  Namespace             default
  TaskQueue             child-workflow
  StartTime             2 minutes ago
  ExecutionTime         7 minutes from now
  StateTransitionCount  2
  HistoryLength         2
  HistorySize           347

Pending Activities: 0
Pending Child Workflows: 0

I'll update the comments on the existing API as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from my own curiosity, found where SignalWorkflow handles the backoff WFT

Copy link
Member

@cretz cretz Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It appears the original form from temporalio/temporal#3984 didn't do this, and temporalio/temporal#4091 came a month later to change the behavior and the docs never got updated.

Many of our SDKs used these docs to write their own, e.g https://pkg.go.dev/go.temporal.io/sdk/internal#StartWorkflowOptions.StartDelay:

If the workflow gets a signal before the delay, a workflow task will be dispatched and the rest of the delay will be ignored.

And https://www.javadoc.io/doc/io.temporal/temporal-sdk/latest/io/temporal/client/WorkflowOptions.Builder.html and so on. So I will make a note to update those SDK docs too.

Copy link
Member

@yycptt yycptt Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual situation is more complex than this...

  • SignalWorkflowExecution will never unblock (basically ignoring skip_generate_workflow_task field in the request when workflow needs a delay).
  • SignalWithStart with skip_generate_workflow_task = true will not unblock. SignalWithStart with skip_generate_workflow_task = false will unblock.

I am thinking we should fix SignalWithStart behavior to match Signal. That's a bug we know about but not fixing for a long time so that we have a way to unblock workflow with delay (cron/retry/start_delay). But now that I think about it, we should just introduce a new tdbg command for unblocking the workflow.

Changing the behavior of SignalWithStart of server side could be regarded as breaking, but my understanding is that very few people is relying on that. We can add a feature flag in for the change, default to the new behavior (or old behavior, and new behavior in release X+1), and call out this change in release note.

Before we make ^ change, I think we should change the comment for all start delay fields to be the same as the one we have in the SignalWithStart request

// Time to wait before dispatching the first workflow task. Cannot be used with cron_schedule.
// Note that the signal will be delivered with the first workflow task. If the workflow gets
// another SignalWithStartWorkflow before the delay and skip_generate_workflow_task is false
// or not set, a workflow task will be dispatched immediately and the rest of the delay period
// will be ignored, even if that request also had a delay. Signal via SignalWorkflowExecution
// will not unblock the workflow.

// delay, a workflow task will be dispatched and the rest of the delay will
// be ignored.
google.protobuf.Duration workflow_start_delay = 18;
}

message ProtocolMessageCommandAttributes {
Expand Down
5 changes: 5 additions & 0 deletions temporal/api/history/v1/message.proto
Original file line number Diff line number Diff line change
Expand Up @@ -623,6 +623,11 @@ message StartChildWorkflowExecutionInitiatedEventAttributes {
// If this is set, the child workflow inherits the Build ID of the parent. Otherwise, the assignment
// rules of the child's Task Queue will be used to independently assign a Build ID to it.
bool inherit_build_id = 19;
// Time to wait before dispatching the child workflow's first task. Cannot
// be used with `cron_schedule`. If the workflow gets a signal before the
// delay, a workflow task will be dispatched and the rest of the delay will
// be ignored.
google.protobuf.Duration workflow_start_delay = 20;
}

message StartChildWorkflowExecutionFailedEventAttributes {
Expand Down
Loading