
feat(sdk): checkpoint batching #104

Merged
yaythomas merged 1 commit into main from checkpoint-batching
Oct 28, 2025

yaythomas commented Oct 27, 2025

Add batching mechanism to reduce checkpoint frequency and improve execution performance by grouping multiple checkpoint operations before persisting state changes.

Closes #48

  • Add batch size configuration to control checkpoint frequency
  • Implement batching logic in state management layer
  • Update execution flow to support deferred checkpointing
  • Asynchronous checkpoints (is_sync=False, batched for performance):
    • Step START with AtLeastOncePerRetry semantics
    • Child context START operations
    • Wait-for-condition START operations
  • Synchronous checkpoints (is_sync=True, immediate persistence):
    • Step START with AtMostOncePerRetry semantics
    • Operation completion (SUCCEED/FAIL)
    • Callback START operations
    • Invoke START operations
    • Large result checkpoints
  • Add batch flush on critical operations and execution completion
  • Implement parent-child tracking and orphan detection for parallel operations
  • Maintain backward compatibility with existing checkpoint behavior
  • Ensure state consistency across batch boundaries
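
The split between batched and immediate checkpoints above can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: `Checkpoint`, `CheckpointBatcher`, `persist`, and `max_batch_size` are hypothetical names chosen for the example, and only the `is_sync` flag comes from this description. The sketch assumes that a synchronous checkpoint flushes everything queued before it, so persisted order matches submission order.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Checkpoint:
    """Hypothetical checkpoint record; only is_sync is named in the PR description."""
    operation_id: str
    action: str  # e.g. "START", "SUCCEED", "FAIL"
    is_sync: bool = False


@dataclass
class CheckpointBatcher:
    """Queues checkpoints and hands them to `persist` in batches (illustrative only)."""
    persist: Callable[[List[Checkpoint]], None]  # assumed backend call, one call per batch
    max_batch_size: int = 10                     # assumed batch size setting
    _pending: List[Checkpoint] = field(default_factory=list, init=False)

    def add(self, checkpoint: Checkpoint) -> None:
        # Always enqueue first so earlier async checkpoints are persisted
        # ahead of (and together with) a later sync checkpoint.
        self._pending.append(checkpoint)
        if checkpoint.is_sync or len(self._pending) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        # Called on sync checkpoints, on a full batch, and at execution completion.
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self.persist(batch)
```

In this sketch, two asynchronous START checkpoints stay queued until a synchronous SUCCEED arrives, at which point all three go out in a single `persist` call. A final `flush()` at execution completion is a no-op if nothing is pending.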

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

yaythomas requested review from a user and ParidelPooya October 27, 2025 22:33
yaythomas force-pushed the checkpoint-batching branch from 0a03f6f to 26a7f1b October 27, 2025 22:34
yaythomas changed the title from "feat: checkpoint batching" to "feat(sdk): checkpoint batching" Oct 27, 2025
yaythomas closed this Oct 27, 2025
yaythomas reopened this Oct 27, 2025
yaythomas force-pushed the checkpoint-batching branch from 26a7f1b to 0cf3c80 October 28, 2025 09:05
yaythomas merged commit 48a1c41 into main Oct 28, 2025
8 checks passed
yaythomas deleted the checkpoint-batching branch October 28, 2025 20:57

Development

Successfully merging this pull request may close these issues.

Batch Checkpoints
