…ite key lookups
- Create OperationKey record in execution/ package with parentId/operationId fields and static factory methods of() and fromOperation()
- Refactor ExecutionManager.operations map from Map<String, Operation> to Map<OperationKey, Operation> for scoped operation lookups
- Update fetchAllPages collector to use OperationKey.fromOperation(op)
- Change getOperationAndUpdateReplayState to accept parentId parameter
- Refactor openPhasers map to Map<OperationKey, Phaser>
- Update startPhaser to accept parentId parameter
- Update BaseDurableOperation to pass null as parentId (temporary until parentId propagation is wired in task 3)
- Update all test files to match new method signatures
- Add parentId field to DurableContext to track parent-child relationships
- Pass parentId through all operation constructors (StepOperation, WaitOperation, InvokeOperation, CallbackOperation)
- Update BaseDurableOperation to accept and store parentId parameter
- Add convenience constructor in BaseDurableOperation for root-context operations where parentId is null
- Update ExecutionManager calls to use parentId for scoped operation lookups instead of null
- Add getContextId() method to DurableContext to retrieve parentId
- Add getParentId() protected method to BaseDurableOperation for accessing parent context ID
- Update operation update builder to include parentId instead of hardcoded null value
- Enables proper operation tracking and isolation within child execution contexts
- Add isReplaying field to track per-context replay mode state
- Initialize replay state based on cached operations in ExecutionManager
- Add setExecutionMode() to transition context from replay to execution
- Refactor DurableContext constructor to private shared initialization
- Add createRootContext() static factory methods for root context creation
- Add createChildContext() static factory method for child context creation
- Add hasOperationsForContext() to ExecutionManager for replay state detection
- Update DurableExecutor to use createRootContext() factory method
- Update test fixtures to use new factory methods
- Enables per-context replay tracking for improved operation handling
…nd unit tests
- Add ChildContextOperation extending BaseDurableOperation with execute/get lifecycle
- Handle first execution (START fire-and-forget, then run child context)
- Handle replay: SUCCEEDED (cached result), FAILED (re-throw), STARTED (re-execute)
- Implement large result handling (>=256KB) via ReplayChildren flow
- Add ChildContextFailedException for non-reconstructable exception fallback
- Add DurableContext.createChildContext() factory (no thread registration)
- Add per-context replay state (isReplaying field) to DurableContext
- Unit tests covering replay scenarios, failure preservation, and non-determinism detection
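The scoped lookup described in the first commit can be sketched roughly as below. This is an illustration only, not the SDK's code: the names `OperationKey`, `parentId`, `operationId`, `of()` and `fromOperation()` come from the commit message, while the `Operation` stand-in record, its fields, and the `OperationKeyDemo` helper are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the SDK's Operation type (assumption: the real type differs).
record Operation(String parentId, String id, String status) {}

// Composite key, so the same operationId can exist under different parents.
// A null parentId denotes an operation in the root context.
record OperationKey(String parentId, String operationId) {
    static OperationKey of(String parentId, String operationId) {
        return new OperationKey(parentId, operationId);
    }

    static OperationKey fromOperation(Operation op) {
        return new OperationKey(op.parentId(), op.id());
    }
}

// Hypothetical helper that builds the kind of map the commit describes.
class OperationKeyDemo {
    static Map<OperationKey, Operation> index(Operation... ops) {
        Map<OperationKey, Operation> byKey = new HashMap<>();
        for (Operation op : ops) {
            byKey.put(OperationKey.fromOperation(op), op);
        }
        return byKey;
    }
}
```

Because records derive `equals` and `hashCode` from their components (null-safe), a root operation with ID `"1"` and a child operation with ID `"1"` under parent `"3"` land in separate map entries.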
sdk/src/main/java/com/amazonaws/lambda/durable/DurableContext.java
sdk/src/main/java/com/amazonaws/lambda/durable/operation/CallbackOperation.java
sdk/src/main/java/com/amazonaws/lambda/durable/operation/ChildContextOperation.java
var contextId = getOperationId();

// Register child context thread before executor runs (prevents suspension)
registerActiveThread(contextId, ThreadType.CONTEXT);
I think it's registered twice and also setCurrentContext is called twice here
This is the same pattern as StepOperation.executeStepLogic. registerActiveThread runs on the parent thread to ensure the child is tracked before the parent can deregister (preventing a false "no active threads" suspension), while setCurrentContext sets the ThreadLocal on the actual child execution thread.
I will add a more detailed comment for this
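For readers of this thread, the ordering argument can be sketched in isolation. This is a simplified illustration under stated assumptions, not the SDK's implementation: `ActiveThreadSketch`, its `activeThreads` set, and its `currentContext` ThreadLocal are hypothetical stand-ins for whatever `registerActiveThread` and `setCurrentContext` manage internally.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical stand-in for the SDK's thread tracking (assumption, for illustration only).
class ActiveThreadSketch {
    static final Set<String> activeThreads = ConcurrentHashMap.newKeySet();
    static final ThreadLocal<String> currentContext = new ThreadLocal<>();

    static String runChild(ExecutorService executor, String contextId) {
        // Parent thread: register the child BEFORE submitting, so there is no window
        // in which the parent has deregistered and the child is not yet tracked.
        activeThreads.add(contextId);

        Future<String> result = executor.submit(() -> {
            // Child thread: a ThreadLocal is only visible on the thread that sets it,
            // so this has to happen here rather than on the parent.
            currentContext.set(contextId);
            try {
                return currentContext.get(); // user code would run here
            } finally {
                currentContext.remove();
                activeThreads.remove(contextId);
            }
        });

        try {
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The two calls act on different threads and serve different purposes, which is why they are not a duplicate registration.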
// Register root context thread as active
/** Creates a root context with the given contextId and registers the current thread. */
static DurableContext createRootContext(
Is this still a Root context if a contextId is specified?
I can actually merge the two constructors to make it more concrete. Good call.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Issue Link, if available
#36
Description
Adds `runInChildContext` and `runInChildContextAsync` to the Java Durable Execution SDK. Each child context gets its own operation counter and checkpoint log, enabling concurrent branches of work with per-context determinism.
Implementation details:
- `ChildContextOperation` extends `BaseDurableOperation<T>` and manages the child context lifecycle: START (fire-and-forget), execute user function in a separate thread, SUCCEED/FAIL (blocking checkpoint).
- `parentId` propagated through `BaseDurableOperation` to all operation subclasses, replacing the hardcoded `null`.
- `isReplaying` on each `DurableContext` instance, since a child may be replaying while the parent is already executing.
- … `-` as separator (e.g., `"3-1"`, `"3-2"` for operations inside parent `"3"`). This matches the JS SDK's `stepPrefix` convention and ensures global uniqueness: the backend validates type consistency by operation ID alone. Nested contexts chain naturally (e.g., `"3-2-1"`).
- `ReplayChildren` flow: SUCCEED checkpoint with empty payload + `ContextOptions { replayChildren: true }`, reconstructed via re-execution on replay.
- `ChildContextFailedException` follows the same pattern as `StepFailedException`.
Deferred:
- … `CheckpointBatcher` (preventing stale checkpoints from in-flight child operations after parent completes)
- `summaryGenerator` for large-result observability
See `docs/design-run-in-child-context.md` for the full design.
Demo/Screenshots
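The operation ID prefixing described in the implementation details can be illustrated with a small sketch. It assumes a 1-based per-context counter and that a child context's ID is the operation ID of the operation that created it. `ContextIdSketch` is a hypothetical class for illustration, not part of the SDK.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of per-context operation counters with "-" prefixing.
class ContextIdSketch {
    private final String contextId; // null for the root context
    private final AtomicInteger counter = new AtomicInteger();

    ContextIdSketch(String contextId) {
        this.contextId = contextId;
    }

    // Each context numbers its own operations; child IDs carry the parent prefix.
    String nextOperationId() {
        int n = counter.incrementAndGet();
        return contextId == null ? String.valueOf(n) : contextId + "-" + n;
    }

    // Creating a child consumes one operation ID in this context;
    // that ID becomes the child's context ID.
    ContextIdSketch child() {
        return new ContextIdSketch(nextOperationId());
    }
}
```

With this sketch, a root context that has already issued `"1"` and `"2"` creates a child with context ID `"3"`, whose operations are numbered `"3-1"`, `"3-2"`, and so on. A child created as the second operation inside `"3"` gets ID `"3-2"`, and its first operation is `"3-2-1"`.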
Checklist
Testing
Unit Tests
- `ChildContextOperationTest`: covers first execution, replay SUCCEEDED, replay FAILED, replay STARTED, the replayChildren path, non-determinism detection, and error handling.
- `DurableContextTest`, `ReplayValidationTest`: updated for the `createRootContext` factory method.
Integration Tests
- `ChildContextIntegrationTest`: 15 different cases including waits and nesting.
Examples
- `ChildContextExample`: demonstrates three concurrent child contexts (two with step+wait, one with a nested child context), collected via `DurableFuture.allOf`.
- … `CloudBasedIntegrationTest`.