@jrepp
Owner

@jrepp jrepp commented Nov 20, 2025

User request: "look at all local branches for unmerged commits, create PRs if they are found by first merging origin/main and submitting the commit data"

This branch contains 9 unmerged commits. Conflicts were resolved automatically with an aggressive merge strategy.

Co-Authored-By: Claude [email protected]

jrepp and others added 9 commits October 16, 2025 07:44
User request: "prism-proxy needs to add drain functionality - when told to stop by the prism-admin it needs to go into a state where it is draining connections. first it should drain and then deny new front end connections, then it should wait for all backend work to be complete that was attached to front end work - the stop signal to the runners should also follow a similar style of shutdown, they should be told that the system is stopping, they will continue to process current requests but will not accept new requests, when all pending work is completed they should exit, when the proxy front end connections have exited and all pattern processes have exited the proxy should cleanly exit"

Implemented comprehensive drain-on-shutdown behavior with coordinated proxy and pattern lifecycle:

**Protobuf Changes**:
- Extended lifecycle.proto with DrainRequest/DrainResponse messages
- Added Drain RPC to LifecycleInterface service
- Updated proxy_control_plane.proto to include DrainRequest in ProxyCommand

**Proxy Implementation** (Rust):
- Added DrainState enum (Running, Draining, Stopping) to ProxyServer
- Implemented drain_and_shutdown() with 5-phase sequence:
  1. Enter drain mode (reject new connections)
  2. Signal pattern runners to drain
  3. Wait for frontend connections to complete (with timeout)
  4. Stop pattern runners
  5. Shutdown gRPC server
- Added active_connections atomic counter for connection tracking
- Updated main.rs to use drain_and_shutdown with 30s default timeout

**Pattern Manager** (Rust):
- Added drain_all_patterns() and stop_all_patterns() methods
- Implemented drain_pattern() to signal individual patterns
- Added PatternClient::drain() for gRPC drain calls

**Go Plugin SDK**:
- Extended Plugin interface with Drain() method
- Added DrainMetrics struct (drained_operations, aborted_operations)
- Implemented Drain RPC in LifecycleService
- Updated memstore and redis drivers with Drain() implementations

**Pattern Runner**:
- Added Drain() to KeyValuePluginAdapter
- Delegates drain to underlying backend drivers
- Proper error handling and logging

**Router Changes**:
- Made pattern_manager field public for drain coordination

**Key Features**:
- Zero data loss on a normal shutdown: in-flight operations complete before exit, bounded by the drain timeout
- Configurable timeouts: Default 30s, environment variable override
- Clear state transitions with logged indicators
- Kubernetes-ready for graceful rolling updates

All binaries compile successfully. Ready for local binary testing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Brings in GitHub merge queue support (#1) from main branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…Drain()

User request: "address the priority actions"

Fixed three categories of CI failures:

1. **Rust Formatting** (cargo fmt):
   - Fixed drain_all_patterns() function signature formatting
   - Fixed drain_pattern() method signature formatting
   - Fixed stop_all_patterns().await chain formatting
   - All formatting now passes cargo fmt --check

2. **Documentation Validation** (ADR-058 frontmatter):
   - Fixed id: 058 → adr-058 (lowercase format required)
   - Fixed status: accepted → Accepted (capitalization)
   - Added missing required fields: deciders, project_id, doc_uuid
   - Changed tags from inline array syntax to YAML block list format
   - All 126 documents now validate successfully

3. **NATS Driver Drain Implementation**:
   - Added Drain() method to NATSPattern
   - Delegates to built-in NATS connection drain during Stop()
   - Returns DrainMetrics with zero operations (handled by library)
   - NATS driver tests now pass

Unit test results:
- ✅ pkg/drivers/nats: All tests pass
- ✅ patterns/consumer: All tests pass
- ✅ patterns/producer: All tests pass

Documentation validation:
- ✅ 126 documents scanned
- ✅ 381 links valid
- ✅ 0 errors

Next: Investigate acceptance test failures in CI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…ule dependency

- Add Drain() method to S3Driver to implement plugin.Plugin interface
- Add replace directive for redis driver in producer pattern go.mod to use local version instead of cached remote version
- Both changes fix compilation errors in acceptance tests and pattern tests

Fixes the following CI test failures:
- Test Producer Pattern
- Test KeyValue Acceptance
- Test Consumer Acceptance
- Test Producer Acceptance
- Test ClaimCheck Acceptance
- Test Unified Acceptance

User request: "fix issues with jrepp/drain-on-shutdown so the pr passes"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Resolved conflicts in generated documentation files (docs/404.html, docs/index.html, docs/sitemap.xml) by accepting deletion from main branch.

Brings in latest changes from main including:
- Mailbox pattern implementation (RFC-037)
- prism-admin SQLite storage (ADR-054)
- Control plane improvements
- New SQLite driver

User request: "resolve conflicts to merge this PR"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
- Add Drain() method to MemPubSub driver (pkg/drivers/memstore/pubsub.go)
- Add Drain() method to SQLite driver (pkg/drivers/sqlite/sqlite.go)
- Add replace directive for memstore in mailbox pattern go.mod

These drivers were added in main and needed the Drain() method to implement
the updated plugin.Plugin interface. All follow the same pattern as other
synchronous drivers (return empty DrainMetrics since operations complete immediately).

User request: "resolve conflicts to merge this PR"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Pattern Adapters:
- ConsumerPluginAdapter: Allow in-flight message processing to complete
- MulticastRegistryPluginAdapter: Allow atomic operations to complete

Drivers:
- PostgresPlugin: Connection pool drains automatically before shutdown
- KafkaPlugin: Flush pending producer messages with timeout, track aborted ops

All implementations now satisfy the plugin.Plugin interface with proper
drain semantics to support graceful shutdown.

Fixes CI test failures:
- TestConsumerProcessBased/ConsumerProcess-NATS-MemStore
- TestConsumerProcessBased/ConsumerProcess-NATS-MemStore-DLQ
- Any tests using Postgres or Kafka drivers

User request: "resolve conflicts to merge this PR"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Comprehensive guide for local Kubernetes setup with k3d as the recommended modern installer.

Key content:
- Comparison of k3d vs kind/minikube/Docker Desktop/k0s
- Complete prerequisites and installation
- Quick start multi-node cluster creation
- Full Prism architecture deployment (7 phases)
- Backing services (Redis, NATS, Postgres, MinIO, Kafka)
- Prism components (admin, proxy, pattern runners)
- Ingress configuration with Nginx
- Observability stack (Prometheus/Grafana)
- Resource limits and HPA
- Troubleshooting guide
- CI/CD integration examples

User request: "write a memo of how to make a local docker kubernetes installation with the service components, what is the best modern 'installer' for k8s?"

Co-Authored-By: Claude <[email protected]>
Comprehensive ADR proposing Kubernetes Operator with CRDs for declarative, flexible Prism cluster management.

Core features:
- PrismCluster CRD (v1alpha1) defines entire stack in single YAML
- Controller reconciliation with 7-phase loop (backends → admin → proxy → patterns → ingress → status)
- Runtime flexibility: add/remove patterns, scale components, enable backends via kubectl patch
- Self-healing with automatic recreation of failed components
- Automatic dependency management (Redis before keyvalue-runner, NATS before consumer-runner)
- Service discovery with automatic ConfigMap generation
- Rich status reporting (phase, component health, conditions)

Technology:
- Kubebuilder framework (vs Operator SDK)
- 8-week implementation plan
- Complete Go controller code examples
- Comparison with Helm, Manual YAML (MEMO-035), Kustomize

MEMO-035 updated:
- Added "Alternative: Kubernetes Operator Deployment" section
- Operator vs Manual YAML comparison table
- Quick operator example with runtime flexibility demos
- When to use operator vs manual YAML guidance
- Migration path from manual YAML to operator

Documentation validated with all 130 docs passing.

User request: "i want this installation to be flexible at install time and runtime, would it be possible to use a kind of CRD and a controller to orchestrate the components?"

Co-Authored-By: Claude <[email protected]>
Copilot AI review requested due to automatic review settings November 20, 2025 22:09
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request adds documentation for Go tooling and CLI utilities through several Architecture Decision Records (ADRs). The PR establishes Go as the language of choice for Prism's developer tooling, CLI utilities, and data migration tools, complementing the existing Rust proxy and Python orchestration components.

Key Changes:

  • Established Go as the standard language for CLI tools and utilities (ADR-012)
  • Defined comprehensive error handling, concurrency, testing, configuration management, and logging patterns for Go code (ADRs 013-017)

Reviewed Changes

Copilot reviewed 17 out of 1125 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| docs/adr/adr-012/index.html | Documents the decision to use Go for tooling and CLI utilities |
| docs/adr/adr-013/index.html | Defines Go error handling strategy using wrapped errors |
| docs/adr/adr-014/index.html | Establishes Go concurrency patterns using fork-join model |
| docs/adr/adr-015/index.html | Outlines Go testing strategy with three-tier approach |
| docs/adr/adr-016/index.html | Specifies CLI and configuration management using Cobra and Viper |

@mergify mergify bot added documentation Improvements or additions to documentation rust go Pull requests that update go code labels Nov 20, 2025
@mergify
mergify bot commented Nov 20, 2025

This PR is very large (XL). Consider breaking it into smaller, more reviewable PRs.

@mergify mergify bot added the size/xl label Nov 20, 2025
@mergify
mergify bot commented Nov 20, 2025

This PR has merge conflicts with the base branch. Please resolve them.

Labels

documentation Improvements or additions to documentation go Pull requests that update go code has-conflicts rust size/xl

2 participants