feat(nodegroup): new integration: spot ocean #6731

Closed
108 changes: 108 additions & 0 deletions docs/proposal-009-spot-ocean.md
@@ -0,0 +1,108 @@
# Spot ocean integration

## Authors

Spot by NetApp Ocean (@spotinst/sig-developers)

## Status

In progress.

## Table of Contents
<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Linked Docs](#linked-docs)
- [Proposal](#proposal)
- [Design Details](#design-details)
- [Test Plan](#test-plan)
- [Alternatives](#alternatives)
<!-- /toc -->

## Summary

We implemented Spot Ocean structures based on the eksctl Cluster and NodeGroup structures from release `0.146.0`. This implementation
lets Spot Ocean users run eksctl against their clusters and node groups.
No dependencies exist between the Spot Ocean and eksctl structures that could cause compatibility problems in the future.

The value of integrating Spot Ocean with `eksctl` is that it gives existing and future AWS customers a way to:

a) Create new clusters and/or node groups with Spot Ocean integration using a
single command.

b) Modify clusters and/or node groups with Spot Ocean integration using a
single command.

Spot by NetApp pledges to fully maintain this integration.
This includes:
- Monthly updates with new features
- Code reviews and feature assessment directly from the eksctl community
- Feature parity with our direct API and UI, so that eksctl gets all the latest features
- Spot by NetApp fully managing support and maintenance of this integration
- Bug fixes directly from the eksctl community
- Urgent 24/7 support available on our platform
- Ensuring full compatibility with the newest versions of Kubernetes and EKS

## Motivation

The overall motivation of this proposal is to solve two problems:

- Many AWS customers with EKS clusters are asking for Spot Ocean integration.
- AWS customers want to integrate their EKS clusters and nodegroups with Spot Ocean through eksctl's configuration.

### Goals

- Enable AWS users to create Spot Ocean clusters and nodegroups using eksctl.
- Enable AWS users to modify their Spot Ocean cluster configs and nodegroups using eksctl.
- Enable AWS users to perform utility actions on their Spot Ocean clusters and nodegroups using eksctl.

### Linked Docs

- [Original PR](https://github.com/weaveworks/eksctl/pull/6731)
- [Spot Ocean docs](../userdocs/src/usage/spot)
- [Expansion issue](https://github.com/weaveworks/eksctl/issues/6694)

## Proposal

This design proposes adding a new `spotOcean` field at both the cluster and the nodegroup level,
and creating clusters with Spot Ocean managed nodegroups.

For example:

```bash
eksctl create cluster \
  --name example \
  --spot-ocean \
  --managed=false
```

This results in a new Spot Ocean cluster.
In addition, the design proposes two new `utils` commands: `update-spot-ocean-cluster` and `update-spot-ocean-credentials`.

For example:
```bash
eksctl utils update-spot-ocean-cluster -v 4 -f ./cluster.yaml
```

Here, `cluster.yaml` contains the updated cluster definition.

## Design Details

The new `--spot-ocean` flag will be added to `eksctl create cluster` and `eksctl create nodegroup`. The option will also be supported in the ClusterConfig file for self-managed nodegroups.
In addition, we have added two new `eksctl utils` commands, `update-spot-ocean-cluster` and `update-spot-ocean-credentials`. Both require a configuration file and are mainly meant for updating an existing cluster.

For more details, see our [Spot Ocean guides](../userdocs/src/usage/spot/ocean/spot-ocean-cluster.md).
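
As a rough illustration of the opt-in model described above, the sketch below assumes that a nodegroup joins Ocean simply by carrying a `spotOcean` block, even an empty one. The `NodeGroup`, `SpotOcean`, and `Strategy` types and the `oceanNodeGroups` helper are simplified stand-ins invented for this example; they are not the types this integration adds.

```go
package main

import "fmt"

// Strategy is a hypothetical, trimmed-down stand-in for per-nodegroup strategy settings.
type Strategy struct {
	SpotPercentage *int
}

// SpotOcean is a hypothetical stand-in for the `spotOcean` block.
// A non-nil pointer means the block is present, even when empty (`spotOcean: {}`).
type SpotOcean struct {
	Strategy *Strategy
}

// NodeGroup is a hypothetical stand-in for a self-managed nodegroup entry.
type NodeGroup struct {
	Name      string
	SpotOcean *SpotOcean
}

// oceanNodeGroups returns the names of the nodegroups that carry a spotOcean block.
func oceanNodeGroups(ngs []NodeGroup) []string {
	var names []string
	for _, ng := range ngs {
		if ng.SpotOcean != nil {
			names = append(names, ng.Name)
		}
	}
	return names
}

func main() {
	pct := 70
	ngs := []NodeGroup{
		{Name: "ocean-ng1", SpotOcean: &SpotOcean{}},
		{Name: "plain-ng"},
		{Name: "ocean-ng3", SpotOcean: &SpotOcean{Strategy: &Strategy{SpotPercentage: &pct}}},
	}
	fmt.Println(oceanNodeGroups(ngs))
}
```

Using a pointer for the block keeps "absent" and "present but empty" distinguishable, which is what makes an empty block usable as an opt-in with defaults.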

### Test Plan

After maintenance work or the release of a new feature, we verify the following:

- All existing unit tests still pass, to make sure our changes broke nothing.
- New EKS clusters can be created on the currently supported Kubernetes versions.
- Nodegroups can be created and modified inside those clusters.
- The utility actions for Ocean cluster management work within eksctl.

## Alternatives

The current alternative is to use our own fork of the main eksctl branch ([repo](https://github.com/spotinst/weaveworks-eksctl/releases/tag/v0.146.0)) for customer purposes.
110 changes: 110 additions & 0 deletions examples/38-spot-ocean.yaml
@@ -0,0 +1,110 @@
# An example of ClusterConfig object with Spot Ocean nodegroups.
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: cluster-22
  region: us-west-2

spotOcean:
  strategy:
    utilizeReservedInstances: true
    fallbackToOnDemand: true

  scheduling:
    shutdownHours:
      isEnabled: true
      timeWindows:
        - Mon:22:00-Tue:06:00
        - Tue:22:00-Wed:06:00
        - Wed:22:00-Thu:06:00
        - Thu:22:00-Fri:06:00
        - Fri:22:00-Mon:06:00

    tasks:
      - isEnabled: true
        taskType: manualHeadroomUpdate
        cronExpression: 0 1 * * *
        config:
          headrooms:
            - cpuPerUnit: 2000
              memoryPerUnit: 4000
              gpuPerUnit: 0
              numOfUnits: 1
      - isEnabled: true
        taskType: manualHeadroomUpdate
        cronExpression: 0 2 * * *
        config:
          headrooms:
            - cpuPerUnit: 0
              memoryPerUnit: 200
              gpuPerUnit: 0
              numOfUnits: 2

  autoScaler:
    enabled: true
    cooldown: 300
    autoConfig: false
    headrooms:
      cpuPerUnit: 2
      gpuPerUnit: 0
      memoryPerUnit: 64
      numOfUnits: 1

  compute:
    instanceTypes:
      whitelist:
        - t3a.large
        - t3a.xlarge
        - t3a.2xlarge
        - m5a.large
        - m5a.xlarge
        - m5a.2xlarge
        - m5a.4xlarge
        - c5.large
        - c5.xlarge
        - c5.2xlarge
        - c5.4xlarge

nodeGroups:
  - name: ocean-ng1
    spotOcean: {}

  - name: ocean-ng2
    spotOcean:
      strategy:
        spotPercentage: 100

      compute:
        instanceTypes:
          - t3a.large
          - t3a.xlarge
          - t3a.2xlarge

      autoScaler:
        headrooms:
          - cpuPerUnit: 2
            gpuPerUnit: 0
            memoryPerUnit: 32
            numOfUnits: 1

  - name: ocean-ng3
    spotOcean:
      strategy:
        spotPercentage: 70

      compute:
        instanceTypes:
          - m5a.large
          - m5a.xlarge
          - m5a.2xlarge
          - m5a.4xlarge
          - c5.large
          - c5.xlarge
          - c5.2xlarge
          - c5.4xlarge

      autoScaler:
        resourceLimits:
          maxInstanceCount: 10
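
The `timeWindows` entries in the example are easiest to check by computing how long each shutdown window lasts. The sketch below is one reading of those entries, assuming each has the form `Day:HH:MM-Day:HH:MM` with three-letter English weekday names and wraps around the week when the end day precedes the start day. The example file does not spell out the format, and `minuteOfWeek` and `windowMinutes` are hypothetical helpers, not part of eksctl or the Spot SDK.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

const minutesPerWeek = 7 * 24 * 60

// weekdays maps the assumed three-letter day names to an index within the week.
var weekdays = map[string]int{"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}

// minuteOfWeek converts a point such as "Mon:22:00" into minutes since Monday 00:00.
func minuteOfWeek(point string) (int, error) {
	day, clock, ok := strings.Cut(point, ":")
	if !ok {
		return 0, fmt.Errorf("invalid time point %q", point)
	}
	d, known := weekdays[day]
	if !known {
		return 0, fmt.Errorf("unknown weekday %q", day)
	}
	t, err := time.Parse("15:04", clock)
	if err != nil {
		return 0, fmt.Errorf("invalid clock %q: %w", clock, err)
	}
	return d*24*60 + t.Hour()*60 + t.Minute(), nil
}

// windowMinutes returns the length of a window such as "Fri:22:00-Mon:06:00",
// wrapping around the end of the week when needed.
func windowMinutes(window string) (int, error) {
	start, end, ok := strings.Cut(window, "-")
	if !ok {
		return 0, fmt.Errorf("invalid window %q", window)
	}
	s, err := minuteOfWeek(start)
	if err != nil {
		return 0, err
	}
	e, err := minuteOfWeek(end)
	if err != nil {
		return 0, err
	}
	return (e - s + minutesPerWeek) % minutesPerWeek, nil
}

func main() {
	for _, w := range []string{"Mon:22:00-Tue:06:00", "Fri:22:00-Mon:06:00"} {
		m, err := windowMinutes(w)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s lasts %d minutes\n", w, m)
	}
}
```

Under this reading, the weekday windows each cover eight hours overnight, while the last entry spans the whole weekend.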
2 changes: 1 addition & 1 deletion go.mod
@@ -442,7 +442,7 @@ require (
 	github.com/spf13/cast v1.5.0 // indirect
 	github.com/spf13/jwalterweatherman v1.1.0 // indirect
 	github.com/spf13/viper v1.15.0 // indirect
-	github.com/spotinst/spotinst-sdk-go v1.129.0 // indirect
+	github.com/spotinst/spotinst-sdk-go v1.149.0 // indirect
 	github.com/ssgreg/nlreturn/v2 v2.2.1 // indirect
 	github.com/stbenjam/no-sprintf-host-port v0.1.1 // indirect
 	github.com/stretchr/objx v0.5.0 // indirect
2 changes: 2 additions & 0 deletions go.sum
@@ -1811,6 +1811,8 @@ github.com/spf13/viper v1.15.0 h1:js3yy885G8xwJa6iOISGFwd+qlUo5AvyXb7CiihdtiU=
 github.com/spf13/viper v1.15.0/go.mod h1:fFcTBJxvhhzSJiZy8n+PeW6t8l+KeT/uTARa0jHOQLA=
 github.com/spotinst/spotinst-sdk-go v1.129.0 h1:1HuySAZ0LuBTmPWGa2I1c6CHx8j+mnrf7B475F2Ub9o=
 github.com/spotinst/spotinst-sdk-go v1.129.0/go.mod h1:C6mrT7+mqOgPyabacjyYTvilu8Xm96mvTvrZQhj99WI=
+github.com/spotinst/spotinst-sdk-go v1.149.0 h1:mg5srf81kTy7mqPJDm8epWDopOnTqP66j4X9I3o4OxE=
+github.com/spotinst/spotinst-sdk-go v1.149.0/go.mod h1:Ku9c4p+kRWnQqmXkzGcTMHLcQKgLHrQZISxeKY7mPqE=
 github.com/src-d/gcfg v1.4.0/go.mod h1:p/UMsR43ujA89BJY9duynAwIpvqEujIH/jFlfL7jWoI=
 github.com/ssgreg/nlreturn/v2 v2.2.1 h1:X4XDI7jstt3ySqGU86YGAURbxw3oTDPK9sPEi6YEwQ0=
 github.com/ssgreg/nlreturn/v2 v2.2.1/go.mod h1:E/iiPB78hV7Szg2YfRgyIrk1AD6JVMTRkkxBiELzh2I=
17 changes: 17 additions & 0 deletions pkg/actions/cluster/delete.go
@@ -25,6 +25,7 @@ import (
 	"github.com/weaveworks/eksctl/pkg/elb"
 	"github.com/weaveworks/eksctl/pkg/fargate"
 	"github.com/weaveworks/eksctl/pkg/kubernetes"
+	"github.com/weaveworks/eksctl/pkg/spot"
 	ssh "github.com/weaveworks/eksctl/pkg/ssh/client"
 	"github.com/weaveworks/eksctl/pkg/utils/apierrors"
 	"github.com/weaveworks/eksctl/pkg/utils/kubeconfig"
@@ -65,6 +66,22 @@ func deleteSharedResources(ctx context.Context, cfg *api.ClusterConfig, ctl *eks
 			return err
 		}
 	}
+
+	// Spot Ocean.
+	{
+		// List all nodegroup stacks.
+		stacks, err := stackManager.ListNodeGroupStacks(ctx)
+		if err != nil {
+			return err
+		}
+
+		// Execute pre-delete actions.
+		if err := spot.RunPreDelete(ctx, ctl.AWSProvider, cfg, cfg.NodeGroups,
+			stacks, spot.NewAlwaysFilter(), false, 0, false); err != nil {
+			return err
+		}
+	}
+
 	return nil
 }
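
The pre-delete call above passes `spot.NewAlwaysFilter()`, which suggests a filter that accepts every nodegroup. The diff does not show the filter's definition, so the following is only a guess at the general pattern: the `Filter` interface, `alwaysFilter`, `nameFilter`, and `runPreDelete` below are hypothetical names written for this illustration and do not reflect the actual `pkg/spot` API.

```go
package main

import "fmt"

// Filter is a hypothetical interface deciding whether a nodegroup takes part in an action.
type Filter interface {
	Match(name string) bool
}

// alwaysFilter accepts every nodegroup (the assumed meaning of an "always" filter).
type alwaysFilter struct{}

func (alwaysFilter) Match(string) bool { return true }

// nameFilter accepts only the nodegroups whose names were listed explicitly.
type nameFilter struct {
	names map[string]struct{}
}

func newNameFilter(names ...string) nameFilter {
	set := make(map[string]struct{}, len(names))
	for _, n := range names {
		set[n] = struct{}{}
	}
	return nameFilter{names: set}
}

func (f nameFilter) Match(name string) bool {
	_, ok := f.names[name]
	return ok
}

// runPreDelete calls hook for each nodegroup accepted by the filter and stops at the first error.
func runPreDelete(names []string, f Filter, hook func(string) error) error {
	for _, n := range names {
		if !f.Match(n) {
			continue
		}
		if err := hook(n); err != nil {
			return fmt.Errorf("pre-delete for nodegroup %q: %w", n, err)
		}
	}
	return nil
}

func main() {
	groups := []string{"ocean-ng1", "ocean-ng2", "ocean-ng3"}
	report := func(n string) error {
		fmt.Println("pre-delete:", n)
		return nil
	}
	// Cluster deletion touches everything, so an always-matching filter fits.
	_ = runPreDelete(groups, alwaysFilter{}, report)
	// Deleting a single nodegroup would use a narrower filter.
	_ = runPreDelete(groups, newNameFilter("ocean-ng2"), report)
}
```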

35 changes: 25 additions & 10 deletions pkg/actions/nodegroup/create.go
@@ -5,6 +5,7 @@ import (
 	"fmt"
 	"io"

+	"github.com/aws/amazon-ec2-instance-selector/v2/pkg/selector"
 	"github.com/kris-nova/logger"
 	"github.com/pkg/errors"

@@ -232,19 +233,33 @@ func (m *Manager) nodeCreationTasks(ctx context.Context, isOwnedCluster bool) er
 		vpcImporter = vpc.NewSpecConfigImporter(*m.ctl.Status.ClusterInfo.Cluster.ResourcesVpcConfig.ClusterSecurityGroupId, cfg.VPC)
 	}

-	allNodeGroupTasks := &tasks.TaskTree{
-		Parallel: true,
-	}
-	nodeGroupTasks := m.stackManager.NewUnmanagedNodeGroupTask(ctx, cfg.NodeGroups, !awsNodeUsesIRSA, vpcImporter)
-	if nodeGroupTasks.Len() > 0 {
-		allNodeGroupTasks.Append(nodeGroupTasks)
+	nodeGroupTasks, err := m.stackManager.NewNodeGroupTask(ctx, cfg.NodeGroups, cfg.ManagedNodeGroups, !awsNodeUsesIRSA, vpcImporter)
+	if err != nil {
+		return fmt.Errorf("failed to create nodegroup tasks: %v", err)
 	}
-	managedTasks := m.stackManager.NewManagedNodeGroupTask(ctx, cfg.ManagedNodeGroups, !awsNodeUsesIRSA, vpcImporter)
-	if managedTasks.Len() > 0 {
-		allNodeGroupTasks.Append(managedTasks)
+
+	// Spot Ocean.
+	{
+		for _, ng := range cfg.NodeGroups {
+			if ng.Name != api.SpotOceanClusterNodeGroupName {
+				continue
+			}
+
+			logger.Debug("ocean: normalizing cluster nodegroup")
+
+			instanceSelector, err := selector.New(ctx, m.ctl.AWSProvider.AWSConfig())
+			if err != nil {
+				return fmt.Errorf("ocean: failed to create instance selector: %v", err)
+			}
+
+			svc := eks.NewNodeGroupService(m.ctl.AWSProvider, instanceSelector, nil)
+			if err := svc.Normalize(ctx, []api.NodePool{ng}, cfg); err != nil {
+				return fmt.Errorf("ocean: failed to normalize cluster nodegroup: %v", err)
+			}
+		}
 	}

-	taskTree.Append(allNodeGroupTasks)
+	taskTree.Append(nodeGroupTasks)
 	return eks.DoAllNodegroupStackTasks(taskTree, meta.Region, meta.Name)
 }
