-
Notifications
You must be signed in to change notification settings - Fork 2
feat: add io weight control on image operator #26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v1.7
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds IO weight control functionality to containerd's image operations (pull, unpack, commit) using cgroups v2. The feature allows prioritizing or deprioritizing IO operations through BFQ scheduler weights or io.weight configuration, enabling better resource management in multi-tenant or resource-constrained environments.
Key Changes:
- New
sys/blkiorunpackage implementing cgroups v2-based IO weight control with systemd slice management - Integration of IO weight control into image pull, unpack, and snapshot commit operations
- Configuration support for IO weight settings in containerd's config file
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| sys/blkiorun/config.go | Defines Config struct for IO weight configuration (10-1000 range) |
| sys/blkiorun/blkiorun_linux.go | Core Linux implementation with cgroup management, BFQ/io.weight conversion, and goroutine execution wrappers |
| sys/blkiorun/blkiorun_other.go | No-op implementation for non-Linux platforms |
| sys/blkiorun/blkiorun_test.go | Comprehensive test coverage for weight conversions and cgroup operations |
| services/server/config/config.go | Adds BlkioConfig struct to CgroupConfig for weight, slice path, and slice name configuration |
| cmd/containerd/command/main.go | Initializes blkiorun on containerd startup with configured settings |
| pull.go | Wraps pull operation with blkiorun.Go to execute in IO-weighted goroutine |
| pkg/unpack/unpacker.go | Wraps layer apply and fetch operations with blkiorun.Local for IO weight control |
| services/snapshots/service.go | Wraps snapshot commit with blkiorun.Go for IO weight control |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ch := make(chan string, 1) | ||
| _, err = conn.StartTransientUnitContext(ctx, name, "replace", props, ch) | ||
| if err != nil { | ||
| if strings.Contains(err.Error(), "already") || strings.Contains(err.Error(), "loaded") { |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function checks if the error message contains "already" or "loaded" to determine if the slice already exists, but these are substring matches on error messages which could be fragile across different systemd versions or locales. Consider checking for specific error types or codes if available in the systemd dbus library.
| if strings.Contains(err.Error(), "already") || strings.Contains(err.Error(), "loaded") { | |
| var dbusErr *dbus.Error | |
| if errors.As(err, &dbusErr) && dbusErr.Name == "org.freedesktop.systemd1.UnitExists" { |
| } | ||
|
|
||
| // Initialize block IO weight control (Linux only, no-op on other platforms) | ||
| blkioConfig := config.Cgroup.Blkio |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Weight field is typed as int in BlkioConfig but is being cast to uint16 here. This could cause issues if a negative value is provided in the configuration. Consider validating that blkioConfig.Weight is non-negative before casting, or change the BlkioConfig.Weight field type to uint16 to prevent negative values at the configuration level.
| blkioConfig := config.Cgroup.Blkio | |
| blkioConfig := config.Cgroup.Blkio | |
| if blkioConfig.Weight < 0 { | |
| return fmt.Errorf("invalid blkio weight %d: must be non-negative: %w", blkioConfig.Weight, errdefs.ErrInvalidArgument) | |
| } |
| func Go[T any](fn func() (T, error)) (T, error) { | ||
| return GoWithConfig(state.config, fn) |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Go function uses state.config without checking if state is nil first. While IsInitialized() is checked in LocalWithConfig, if Go is called directly when blkiorun is not initialized, this will cause a nil pointer dereference panic.
| // BlkioConfig provides block IO configuration for cgroups v2 | ||
| type BlkioConfig struct { | ||
| // Weight is the IO weight value (10-1000) for image pull, unpack, and commit operations. | ||
| // Set to 0 to disable IO weight control. Default is 0 (disabled). |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation states "Set to 0 to disable IO weight control" but doesn't mention that values outside the range 10-1000 will also result in the feature being disabled. Consider clarifying this behavior in the documentation or adding a note that invalid values will disable the feature with a warning.
| // Set to 0 to disable IO weight control. Default is 0 (disabled). | |
| // Set to 0 to disable IO weight control. Default is 0 (disabled). Any non-zero value | |
| // outside the valid range (10-1000) will be treated as invalid, causing IO weight | |
| // control to be disabled and a warning to be emitted. |
sys/blkiorun/blkiorun_linux.go
Outdated
| } | ||
|
|
||
| func (cg *cgroup) destroy() { | ||
| os.Remove(cg.path) |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The destroy method uses os.Remove which only logs errors internally. If the cgroup removal fails (e.g., due to permissions or processes still in the cgroup), there's no way to detect or handle this. While this might be acceptable for cleanup, consider whether failed removals could lead to resource accumulation over time, especially in high-churn scenarios.
| os.Remove(cg.path) | |
| if err := os.Remove(cg.path); err != nil && !os.IsNotExist(err) { | |
| log.G(context.Background()).WithError(err).WithField("cgroupPath", cg.path).Warn("failed to remove cgroup") | |
| } |
| bfqSupportedOnce.Do(func() { | ||
| bfqPath := filepath.Join(cgroupPath, "io.bfq.weight") | ||
| if _, err := os.Stat(bfqPath); err == nil { | ||
| bfqSupported = true | ||
| } | ||
| }) | ||
| return bfqSupported | ||
| } | ||
|
|
||
| // readIOWeight reads the current IO weight from cgroups. | ||
| // Returns BFQ weight if available, otherwise io.weight converted to BFQ range. |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The isBFQSupported function uses sync.Once to set the bfqSupported global variable, but the cgroupPath parameter is passed in each call. If this function is called with different cgroupPath values, only the first path will be checked, and subsequent calls with different paths will return the cached result from the first call. This could lead to incorrect BFQ support detection if different cgroup paths have different BFQ availability.
| bfqSupportedOnce.Do(func() { | |
| bfqPath := filepath.Join(cgroupPath, "io.bfq.weight") | |
| if _, err := os.Stat(bfqPath); err == nil { | |
| bfqSupported = true | |
| } | |
| }) | |
| return bfqSupported | |
| } | |
| // readIOWeight reads the current IO weight from cgroups. | |
| // Returns BFQ weight if available, otherwise io.weight converted to BFQ range. | |
| bfqPath := filepath.Join(cgroupPath, "io.bfq.weight") | |
| if _, err := os.Stat(bfqPath); err == nil { | |
| return true | |
| } | |
| return false | |
| } | |
| // readIOWeight reads the current IO weight from cgroups. | |
| // Returns BFQ weight if available, otherwise io.weight converted to BFQ range. | |
| // Returns BFQ weight if available, otherwise io.weight converted to BFQ range. |
| _, err = blkiorun.Go(func() (struct{}, error) { | ||
| return struct{}{}, sn.Commit(ctx, cr.Name, cr.Key, opts...) | ||
| }) | ||
|
|
||
| if err != nil { | ||
| return nil, errdefs.ToGRPC(err) |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error from blkiorun.Go is assigned to err, but this shadows the original err variable. If the Commit operation succeeds but there's an error in the blkiorun.Go wrapper itself (which shouldn't happen since it's returning a struct{} and the actual error), this could lead to confusion. Consider using a different variable name or ensuring the error handling is clear.
| _, err = blkiorun.Go(func() (struct{}, error) { | |
| return struct{}{}, sn.Commit(ctx, cr.Name, cr.Key, opts...) | |
| }) | |
| if err != nil { | |
| return nil, errdefs.ToGRPC(err) | |
| _, commitErr := blkiorun.Go(func() (struct{}, error) { | |
| return struct{}{}, sn.Commit(ctx, cr.Name, cr.Key, opts...) | |
| }) | |
| if commitErr != nil { | |
| return nil, errdefs.ToGRPC(commitErr) |
| type BlkioConfig struct { | ||
| // Weight is the IO weight value (10-1000) for image pull, unpack, and commit operations. | ||
| // Set to 0 to disable IO weight control. Default is 0 (disabled). | ||
| Weight int `toml:"weight"` |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Weight field is defined as int which allows negative values. Since the valid range is 10-1000 (or 0 to disable), negative values don't make semantic sense. Consider changing the type to uint16 to match the internal Config type and prevent invalid negative values from being specified in configuration files.
| Weight int `toml:"weight"` | |
| Weight uint16 `toml:"weight"` |
| func Init(cfg Config, slicePath, sliceName string) error { | ||
| var initErr error | ||
|
|
||
| stateOnce.Do(func() { | ||
| s := &globalState{} | ||
| defer func() { state = s }() | ||
|
|
||
| if cfg.Weight == 0 { | ||
| log.L.Debug("blkiorun: disabled (weight=0)") | ||
| return | ||
| } | ||
|
|
||
| if cfg.Weight < BFQWeightMin || cfg.Weight > BFQWeightMax { | ||
| log.L.Warnf("blkiorun: weight %d out of range [%d, %d], disabled", cfg.Weight, BFQWeightMin, BFQWeightMax) | ||
| return | ||
| } | ||
|
|
||
| s.config = cfg | ||
| log.L.Infof("blkiorun: weight configured: %d", cfg.Weight) | ||
|
|
||
| if !isCgroupV2() { | ||
| log.L.Warn("blkiorun: cgroups v2 not available, disabled") | ||
| return | ||
| } | ||
|
|
||
| // Get containerd's cgroup path | ||
| var err error | ||
| s.containerdPath, err = getCurrentCgroupPath() | ||
| if err != nil { | ||
| initErr = fmt.Errorf("failed to get cgroup path: %w", err) | ||
| return | ||
| } | ||
| log.L.Debugf("blkiorun: containerd cgroup: %s", s.containerdPath) | ||
|
|
||
| var cgroupPath string | ||
| if slicePath != "" { | ||
| cgroupPath = slicePath | ||
| log.L.Debugf("blkiorun: using configured path: %s", cgroupPath) | ||
| } else { | ||
| // Create systemd slice | ||
| if sliceName == "" { | ||
| sliceName = DefaultSliceName | ||
| } | ||
| if !strings.HasSuffix(sliceName, ".slice") { | ||
| sliceName += ".slice" | ||
| } | ||
|
|
||
| ctx, cancel := context.WithTimeout(context.Background(), SystemdTimeout) | ||
| defer cancel() | ||
|
|
||
| if err := createSlice(ctx, sliceName); err != nil { | ||
| log.L.WithError(err).Warnf("blkiorun: failed to create slice %s", sliceName) | ||
| return | ||
| } | ||
|
|
||
| cgroupPath = sliceCgroupPath(sliceName) | ||
| for i := 0; i < sliceWaitRetries; i++ { | ||
| if _, err := os.Stat(cgroupPath); err == nil { | ||
| break | ||
| } | ||
| time.Sleep(sliceWaitInterval) | ||
| } | ||
| } | ||
|
|
||
| // Verify io.weight is available | ||
| if _, err := os.Stat(filepath.Join(cgroupPath, "io.weight")); os.IsNotExist(err) { | ||
| log.L.Warn("blkiorun: io.weight not available") | ||
| return | ||
| } | ||
|
|
||
| // Enable io controller for children | ||
| if err := enableIOController(cgroupPath); err != nil { | ||
| log.L.WithError(err).Warn("blkiorun: failed to enable io controller") | ||
| return | ||
| } | ||
|
|
||
| // Apply default IO weight to slice | ||
| if err := applyConfig(cgroupPath, cfg); err != nil { | ||
| log.L.WithError(err).Warn("blkiorun: failed to apply config") | ||
| return | ||
| } | ||
|
|
||
| s.slicePath = cgroupPath | ||
| s.initialized = true | ||
| log.L.Infof("blkiorun: initialized at %s with weight %d", cgroupPath, cfg.Weight) | ||
| }) | ||
|
|
||
| return initErr | ||
| } |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Init function has comprehensive logic for initialization, error handling, and validation, but there's no test coverage for it. Consider adding tests for Init with various configurations (valid weight, invalid weight, missing cgroups v2, etc.) to ensure proper initialization behavior and error handling.
| } | ||
|
|
||
| id := atomic.AddUint64(&counter, 1) | ||
| path := filepath.Join(state.slicePath, fmt.Sprintf("blkio-%d-%d", os.Getpid(), id)) |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cgroup path includes both PID and an atomic counter, which is good. However, if the same PID is reused after a process restart and the counter is reset, there's a potential (though unlikely) collision risk. Consider whether the counter should persist or if additional uniqueness factors should be included.
| path := filepath.Join(state.slicePath, fmt.Sprintf("blkio-%d-%d", os.Getpid(), id)) | |
| now := time.Now().UnixNano() | |
| path := filepath.Join(state.slicePath, fmt.Sprintf("blkio-%d-%d-%d", os.Getpid(), now, id)) |
Signed-off-by: zijiren233 <[email protected]>
2ce31a4 to
fcd4181
Compare
No description provided.