Feature: Graceful shutdown on unrecoverable worker errors #922

@mensfeld

Summary

Allow Shoryuken to gracefully shut down when workers encounter unrecoverable errors, enabling container orchestrators (ECS, Kubernetes) to restart the service.

Use Case

Common scenario with AWS RDS credential rotation:

  • Database password rotates every 7 days via AWS Secrets Manager
  • Running ECS tasks have stale credentials in environment
  • ActiveRecord::NoDatabaseError is raised and caught by retry middleware
  • Messages keep being retried even though no retry can succeed until the process restarts
  • Desired behavior: Shoryuken shuts down, ECS restarts container with fresh env

Proposed Solution

  1. Add configuration for "fatal" or "unrecoverable" error classes
  2. When these errors occur, trigger graceful shutdown instead of retry
  3. Similar to how Sidekiq handles Sidekiq::Shutdown

Proposed configuration (config.fatal_errors does not exist yet):

Shoryuken.configure_server do |config|
  config.fatal_errors = [
    ActiveRecord::NoDatabaseError,
    Aws::SQS::Errors::InvalidClientTokenId
  ]
end
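
A minimal sketch of how the proposed behavior might be approximated, written as a wrapper with the same `call(worker, queue, sqs_msg, body)` shape as a Shoryuken server middleware. The class name `FatalErrorShutdown` and the `fatal_errors:` / `on_fatal:` parameters are hypothetical and not part of Shoryuken. The default handler sends `TERM` to the current process, on the assumption that the process treats `TERM` as a shutdown request; how graceful that shutdown is depends on the installed Shoryuken version.

```ruby
# Hypothetical sketch: NOT part of Shoryuken's API.
# Wraps message processing and requests a process shutdown when one of
# the configured "fatal" error classes is raised.
class FatalErrorShutdown
  # fatal_errors: array of exception classes treated as unrecoverable.
  # on_fatal:     callable invoked with the exception; the default asks the
  #               current process to shut down via SIGTERM (assumption:
  #               the process handles TERM as a shutdown request).
  def initialize(fatal_errors:, on_fatal: ->(_error) { Process.kill('TERM', Process.pid) })
    @fatal_errors = fatal_errors
    @on_fatal = on_fatal
  end

  # Same argument shape as a Shoryuken server middleware.
  def call(_worker, _queue, _sqs_msg, _body)
    yield
  rescue *@fatal_errors => e
    @on_fatal.call(e)
    # Re-raise so the message is not treated as successfully processed.
    raise
  end
end
```

Injecting `on_fatal` keeps the shutdown mechanism swappable, for example to log the error or use a different signal, and makes the wrapper testable without killing the test process. Errors outside `fatal_errors` pass through untouched, so any existing retry handling would still apply to them.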

Original Issue

This was originally requested in #773 and closed by the stale bot without resolution.
