Commit 5bdb79d

Bump minor version.
1 parent face26c commit 5bdb79d

7 files changed: +255 -7 lines changed

context/getting-started.md

Lines changed: 16 additions & 5 deletions
@@ -109,25 +109,36 @@ This will start:

 ### Adding Health Monitors

-You can add monitors to detect and respond to unhealthy conditions. For example, to add a memory monitor:
+You can add monitors to observe worker health and automatically respond to issues. Monitors are useful for:
+
+- **Memory leak detection**: Automatically restart workers consuming excessive memory.
+- **Performance monitoring**: Track CPU and memory usage trends.
+- **Capacity planning**: Understand resource requirements.
+
+For example, to add monitoring:

 ```ruby
 service "supervisor" do
   include Async::Container::Supervisor::Environment

   monitors do
     [
-      # Restart workers that exceed 500MB of memory:
+      # Log process metrics for observability:
+      Async::Container::Supervisor::ProcessMonitor.new(
+        interval: 60
+      ),
+
+      # Restart workers exceeding memory limits:
       Async::Container::Supervisor::MemoryMonitor.new(
-        interval: 10, # Check every 10 seconds
-        maximum_size_limit: 1024 * 1024 * 500 # 500MB limit
+        interval: 10,
+        maximum_size_limit: 1024 * 1024 * 500 # 500MB limit per process
       )
     ]
   end
 end
 ```

-The {ruby Async::Container::Supervisor::MemoryMonitor} will periodically check worker memory usage and restart any workers that exceed the configured limit.
+See the {ruby Async::Container::Supervisor::MemoryMonitor Memory Monitor} and {ruby Async::Container::Supervisor::ProcessMonitor Process Monitor} guides for detailed configuration options and best practices.

 ### Collecting Diagnostics

context/index.yaml

Lines changed: 8 additions & 0 deletions
@@ -10,3 +10,11 @@ files:
     title: Getting Started
     description: This guide explains how to get started with `async-container-supervisor`
       to supervise and monitor worker processes in your Ruby applications.
+  - path: memory-monitor.md
+    title: Memory Monitor
+    description: This guide explains how to use the <code class="language-ruby">Async::Container::Supervisor::MemoryMonitor</code>
+      to detect and restart workers that exceed memory limits or develop memory leaks.
+  - path: process-monitor.md
+    title: Process Monitor
+    description: This guide explains how to use the <code class="language-ruby">Async::Container::Supervisor::ProcessMonitor</code>
+      to log CPU and memory metrics for your worker processes.

context/memory-monitor.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
+# Memory Monitor
+
+This guide explains how to use the {ruby Async::Container::Supervisor::MemoryMonitor} to detect and restart workers that exceed memory limits or develop memory leaks.
+
+## Overview
+
+Long-running worker processes often accumulate memory over time, either through legitimate growth or memory leaks. Without intervention, workers can consume all available system memory, causing performance degradation or system crashes. The `MemoryMonitor` solves this by automatically detecting and restarting problematic workers before they impact system stability.
+
+Use the `MemoryMonitor` when you need:
+
+- **Memory leak protection**: Automatically restart workers that continuously accumulate memory.
+- **Resource limits**: Enforce maximum memory usage per worker.
+- **System stability**: Prevent runaway processes from exhausting system memory.
+- **Leak diagnosis**: Capture memory samples when leaks are detected for debugging.
+
+The monitor uses the `memory-leak` gem to track process memory usage over time, detecting abnormal growth patterns that indicate leaks.
+
+## Usage
+
+Add a memory monitor to your supervisor service to automatically restart workers that exceed 500MB:
+
+```ruby
+service "supervisor" do
+  include Async::Container::Supervisor::Environment
+
+  monitors do
+    [
+      Async::Container::Supervisor::MemoryMonitor.new(
+        # Check worker memory every 10 seconds:
+        interval: 10,
+
+        # Restart workers exceeding 500MB:
+        maximum_size_limit: 1024 * 1024 * 500
+      )
+    ]
+  end
+end
+```
+
+When a worker exceeds the limit:
+1. The monitor logs the leak detection.
+2. Optionally captures a memory sample for debugging.
+3. Sends `SIGINT` to gracefully shut down the worker.
+4. The container automatically spawns a replacement worker.
+
+## Configuration Options
+
+The `MemoryMonitor` accepts the following options:
+
+### `interval`
+
+The interval (in seconds) at which to check for memory leaks. Default: `10` seconds.
+
+```ruby
+Async::Container::Supervisor::MemoryMonitor.new(interval: 30)
+```
+
+### `maximum_size_limit`
+
+The maximum memory size (in bytes) per process. When a process exceeds this limit, it will be restarted.
+
+```ruby
+# 500MB limit
+Async::Container::Supervisor::MemoryMonitor.new(maximum_size_limit: 1024 * 1024 * 500)
+
+# 1GB limit
+Async::Container::Supervisor::MemoryMonitor.new(maximum_size_limit: 1024 * 1024 * 1024)
+```
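
Because `maximum_size_limit` is expressed in bytes, repeated `1024 * 1024 * ...` arithmetic is easy to mistype. The following is a minimal sketch, assuming you define your own constants; `MEBIBYTE` and `GIBIBYTE` are hypothetical names and are not provided by the library:

```ruby
# Hypothetical helper constants for byte arithmetic (not part of the library).
MEBIBYTE = 1024 * 1024
GIBIBYTE = 1024 * MEBIBYTE

# The same values used in the examples for `maximum_size_limit`:
per_process_limit = 500 * MEBIBYTE # 524_288_000 bytes
larger_limit = 1 * GIBIBYTE        # 1_073_741_824 bytes

puts "500MB limit in bytes: #{per_process_limit}"
puts "1GB limit in bytes:   #{larger_limit}"
```

Either value could then be passed as `maximum_size_limit: per_process_limit`.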
+
+### `total_size_limit`
+
+The total size limit (in bytes) for all monitored processes combined. If not specified, only per-process limits are enforced.
+
+```ruby
+# Total limit of 2GB across all workers
+Async::Container::Supervisor::MemoryMonitor.new(
+  maximum_size_limit: 1024 * 1024 * 500, # 500MB per process
+  total_size_limit: 1024 * 1024 * 1024 * 2 # 2GB total
+)
+```
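
When choosing the two limits together, it can help to check how many workers fit under the total if each one grows to its own limit. This is plain arithmetic on the values above, not a description of how the monitor resolves a total-limit breach (that behavior is not specified in this guide):

```ruby
per_process_limit = 1024 * 1024 * 500    # 500MB per process
total_size_limit = 1024 * 1024 * 1024 * 2 # 2GB total

# Integer division: the number of workers that can each reach
# their per-process limit without exceeding the total limit.
workers_at_limit = total_size_limit / per_process_limit

puts "Workers that fit at the per-process limit: #{workers_at_limit}"
```

With these values, a pool larger than `workers_at_limit` could exceed the total limit before any single worker exceeds its own.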
+
+### `memory_sample`
+
+Options for capturing memory samples when a leak is detected. If `nil`, memory sampling is disabled.
+
+Default: `{duration: 30, timeout: 120}`
+
+```ruby
+# Customize memory sampling:
+Async::Container::Supervisor::MemoryMonitor.new(
+  memory_sample: {
+    duration: 60, # Sample for 60 seconds
+    timeout: 180 # Timeout after 180 seconds
+  }
+)
+
+# Disable memory sampling:
+Async::Container::Supervisor::MemoryMonitor.new(
+  memory_sample: nil
+)
+```
+
+## Memory Leak Detection
+
+When a memory leak is detected, the monitor will:
+
+1. Log the leak detection with process details.
+2. If `memory_sample` is configured, capture a memory sample from the worker.
+3. Send a `SIGINT` signal to gracefully restart the worker.
+4. The container will automatically restart the worker process.
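
To make the `SIGINT` step concrete, here is a small standalone sketch of a process that exits cleanly when interrupted. It only illustrates the signal mechanics using Ruby's standard `fork`, `Signal.trap` and `Process.kill`; it is not the monitor's implementation, the method name is hypothetical, and it assumes a platform that supports `fork`:

```ruby
# Illustrative only: a child process that exits cleanly on SIGINT.
def run_interruptible_worker
  ready_reader, ready_writer = IO.pipe

  pid = fork do
    ready_reader.close
    # Handle SIGINT by exiting with a success status.
    # exit! skips any at_exit hooks inherited from the parent.
    Signal.trap("INT") { exit!(0) }
    # Tell the parent that the handler is installed.
    ready_writer.puts("ready")
    ready_writer.close
    sleep # Simulate a long-running worker.
  end

  ready_writer.close
  ready_reader.gets # Wait until the worker has installed its handler.
  ready_reader.close

  Process.kill("INT", pid)
  _, status = Process.wait2(pid)
  status
end

status = run_interruptible_worker
puts "Worker exited cleanly: #{status.success?}"
```

A worker that ignores or mishandles `SIGINT` would not shut down this way, so it is worth checking how your own workers respond to it.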
+
+### Memory Sampling
+
+When a memory leak is detected and `memory_sample` is configured, the monitor requests a lightweight memory sample from the worker. This sample:
+
+- Tracks allocations during the sampling period.
+- Forces a garbage collection.
+- Returns a JSON report showing retained objects.
+
+The report includes:
+- `total_allocated`: Total allocated memory and object count.
+- `total_retained`: Total retained memory and count after GC.
+- `by_gem`: Breakdown by gem/library.
+- `by_file`: Breakdown by source file.
+- `by_location`: Breakdown by specific file:line locations.
+- `by_class`: Breakdown by object class.
+- `strings`: String allocation analysis.
+
+This is much more efficient than a full heap dump using `ObjectSpace.dump_all`.

context/process-monitor.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+# Process Monitor
+
+This guide explains how to use the {ruby Async::Container::Supervisor::ProcessMonitor} to log CPU and memory metrics for your worker processes.
+
+## Overview
+
+Understanding how your workers consume resources over time is essential for performance optimization, capacity planning, and debugging. Without visibility into CPU and memory usage, you can't identify bottlenecks, plan infrastructure scaling, or diagnose production issues effectively.
+
+The `ProcessMonitor` provides this observability by periodically capturing and logging comprehensive metrics for your entire application process tree.
+
+Use the `ProcessMonitor` when you need:
+
+- **Performance analysis**: Identify which workers consume the most CPU or memory.
+- **Capacity planning**: Determine optimal worker counts and memory requirements.
+- **Trend monitoring**: Track resource usage patterns over time.
+- **Debugging assistance**: Correlate resource usage with application behavior.
+- **Cost optimization**: Right-size infrastructure based on actual usage.
+
+Unlike the {ruby Async::Container::Supervisor::MemoryMonitor}, which takes action when limits are exceeded, the `ProcessMonitor` is purely observational - it logs metrics without interfering with worker processes.
+
+## Usage
+
+Add a process monitor to log resource usage every minute:
+
+```ruby
+service "supervisor" do
+  include Async::Container::Supervisor::Environment
+
+  monitors do
+    [
+      # Log CPU and memory metrics for all processes:
+      Async::Container::Supervisor::ProcessMonitor.new(
+        interval: 60 # Capture metrics every minute
+      )
+    ]
+  end
+end
+```
+
+The metrics are logged as named fields, which allows you to easily search and filter by specific fields:
+- `general.process_id = 12347` - Find metrics for a specific process.
+- `general.command = "worker-1"` - Find all metrics for worker processes.
+- `general.processor_utilization > 50` - Find high CPU usage processes.
+- `general.resident_size > 500000` - Find processes using more than 500MB.
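
The same kind of filtering can be sketched in plain Ruby. The records below are hypothetical: the field names come from this guide, but the values and the flat hash shape are assumptions, and the actual log format may differ:

```ruby
# Hypothetical records; values are invented for illustration.
records = [
  {process_id: 12345, command: "supervisor", processor_utilization: 1.5, resident_size: 48_000},
  {process_id: 12346, command: "worker-1", processor_utilization: 72.0, resident_size: 610_000},
  {process_id: 12347, command: "worker-2", processor_utilization: 12.0, resident_size: 230_000},
]

# resident_size is reported in KB, so 500MB is roughly 500_000 KB.
large = records.select { |record| record[:resident_size] > 500_000 }

# processor_utilization is a percentage.
busy = records.select { |record| record[:processor_utilization] > 50 }

large_ids = large.map { |record| record[:process_id] }
busy_ids = busy.map { |record| record[:process_id] }

puts "Large processes: #{large_ids.inspect}"
puts "Busy processes:  #{busy_ids.inspect}"
```

Note the unit: the threshold is compared against kilobytes, not bytes.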
+
+## Configuration Options
+
+### `interval`
+
+The interval (in seconds) at which to capture and log process metrics. Default: `60` seconds.
+
+```ruby
+# Log every 30 seconds
+Async::Container::Supervisor::ProcessMonitor.new(interval: 30)
+
+# Log every 5 minutes
+Async::Container::Supervisor::ProcessMonitor.new(interval: 300)
+```
+
+## Captured Metrics
+
+The `ProcessMonitor` captures the following metrics for each process:
+
+### Core Metrics
+
+- **process_id**: Unique identifier for the process.
+- **parent_process_id**: The parent process that spawned this one.
+- **process_group_id**: Process group identifier.
+- **command**: The command name.
+- **processor_utilization**: CPU usage percentage.
+- **resident_size**: Physical memory used (KB).
+- **total_size**: Total memory space including shared memory (KB).
+- **processor_time**: Total CPU time used (seconds).
+- **elapsed_time**: How long the process has been running (seconds).
+
+### Detailed Memory Metrics
+
+When available (OS-dependent), additional memory details are captured:
+
+- **map_count**: Number of memory mappings (stacks, libraries, etc.).
+- **proportional_size**: Memory usage accounting for shared memory (KB).
+- **shared_clean_size**: Unmodified shared memory (KB).
+- **shared_dirty_size**: Modified shared memory (KB).
+- **private_clean_size**: Unmodified private memory (KB).
+- **private_dirty_size**: Modified private memory (KB).
+- **referenced_size**: Active page-cache (KB).
+- **anonymous_size**: Memory not backed by files (KB).
+- **swap_size**: Memory swapped to disk (KB).
+- **proportional_swap_size**: Proportional swap usage (KB).
+- **major_faults**: The number of page faults requiring I/O.
+- **minor_faults**: The number of page faults that don't require I/O (e.g. CoW).
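
Some useful figures can be derived from these metrics. The sketch below uses a hypothetical sample (field names from the lists above, values invented) and assumes that 1 KB means 1024 bytes, which this guide does not state explicitly:

```ruby
# Hypothetical sample; values are invented for illustration.
sample = {processor_time: 90.0, elapsed_time: 3600.0, resident_size: 256_000}

# Lifetime average CPU usage, as a percentage of one core:
# CPU seconds consumed divided by wall-clock seconds running.
average_utilization = sample[:processor_time] / sample[:elapsed_time] * 100

# Convert KB to bytes (assuming 1 KB = 1024 bytes) so the value can be
# compared with byte-based limits such as `maximum_size_limit`.
resident_bytes = sample[:resident_size] * 1024

puts "Average CPU utilization: #{average_utilization.round(2)}%"
puts "Resident size in bytes:  #{resident_bytes}"
```

The unit conversion matters because the `MemoryMonitor` limits are given in bytes while these metrics are reported in KB.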

lib/async/container/supervisor/version.rb

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ module Async
   module Container
     # @namespace
     module Supervisor
-      VERSION = "0.7.0"
+      VERSION = "0.8.0"
     end
   end
 end

readme.md

Lines changed: 9 additions & 0 deletions
@@ -18,10 +18,19 @@ Please see the [project documentation](https://socketry.github.io/async-containe

 - [Getting Started](https://socketry.github.io/async-container-supervisor/guides/getting-started/index) - This guide explains how to get started with `async-container-supervisor` to supervise and monitor worker processes in your Ruby applications.

+- [Memory Monitor](https://socketry.github.io/async-container-supervisor/guides/memory-monitor/index) - This guide explains how to use the <code class="language-ruby">Async::Container::Supervisor::MemoryMonitor</code> to detect and restart workers that exceed memory limits or develop memory leaks.
+
+- [Process Monitor](https://socketry.github.io/async-container-supervisor/guides/process-monitor/index) - This guide explains how to use the <code class="language-ruby">Async::Container::Supervisor::ProcessMonitor</code> to log CPU and memory metrics for your worker processes.
+
 ## Releases

 Please see the [project releases](https://socketry.github.io/async-container-supervisor/releases/index) for all releases.

+### v0.8.0
+
+- Add `Async::Container::Supervisor::ProcessMonitor` for logging CPU and memory metrics periodically.
+- Fix documentation to use correct `maximum_size_limit:` parameter name for `MemoryMonitor` (was incorrectly documented as `limit:`).
+
 ### v0.7.0

 - If a memory leak is detected, sample memory usage for 60 seconds before exiting.

releases.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Releases

-## Unreleased
+## v0.8.0

 - Add `Async::Container::Supervisor::ProcessMonitor` for logging CPU and memory metrics periodically.
 - Fix documentation to use correct `maximum_size_limit:` parameter name for `MemoryMonitor` (was incorrectly documented as `limit:`).
