Commit 1aafe74

[WIP] Self hosted: Add lgalloc integration setup
1 parent dbe4273 commit 1aafe74

9 files changed

+680
-0
lines changed

misc/nvme-bootstrap/README.md

Lines changed: 264 additions & 0 deletions
@@ -0,0 +1,264 @@
# OpenEBS NVMe Bootstrap for Materialize

This guide helps you set up and configure NVMe instance store volumes for optimal Materialize performance on Kubernetes. The solution automatically detects and configures NVMe devices and makes them available to Materialize through OpenEBS LVM storage classes.

> **WARNING:** This setup **automatically partitions and formats NVMe instance store volumes**. Make sure your nodes have NVMe storage (for example, `r6gd.2xlarge` or `r7gd.2xlarge`), and verify your backups before proceeding with the setup.

## Overview

Materialize requires fast, locally attached NVMe storage for optimal performance. This solution:

1. Automatically detects NVMe instance store devices on your nodes
2. Creates an LVM volume group from these devices
3. Configures OpenEBS to provision persistent volumes from this storage
4. Makes high-performance storage available to Materialize

## Prerequisites

- AWS account with permissions to create EC2 instances with NVMe storage
- Kubernetes cluster with nodes that have NVMe instance store volumes
  - **Important**: You must use instance types with NVMe storage (those with the "d" suffix)
  - Recommended instance types: `r6gd.2xlarge`, `r7gd.2xlarge` (not `r8g.2xlarge`, which lacks NVMe storage)
  - When using Bottlerocket OS, additional configuration is handled automatically
- Tools required:
  - [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
  - [Helm](https://helm.sh/docs/intro/install/) (v3.2.0+)
  - [Docker](https://docs.docker.com/get-docker/) (for building the container)
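
The "d" suffix rule above can be checked before provisioning nodes. The following is a minimal sketch, not part of this commit: the function name `has_instance_store_suffix` is hypothetical, and the check is only a naming heuristic (some families with instance storage do not follow the "d" convention), so treat the AWS instance type documentation as authoritative.

```shell
# Hypothetical helper: succeeds when the instance family name carries a "d"
# after its generation number (for example r6gd), per the naming rule above.
has_instance_store_suffix() {
  family="${1%%.*}"            # "r6gd.2xlarge" -> "r6gd"
  case "$family" in
    *[0-9]*d*) return 0 ;;     # a digit followed later by "d"
    *)         return 1 ;;
  esac
}

for instance_type in r6gd.2xlarge r7gd.2xlarge r8g.2xlarge; do
  if has_instance_store_suffix "$instance_type"; then
    echo "$instance_type: name indicates NVMe instance storage"
  else
    echo "$instance_type: no \"d\" suffix in the family name"
  fi
done
```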

## Automated Setup with Terraform

If you're using the [Materialize AWS Terraform module](https://github.com/MaterializeInc/terraform-aws-materialize), you can enable NVMe bootstrap by configuring:

```hcl
module "materialize" {
  source = "git::https://github.com/MaterializeInc/terraform-aws-materialize.git"

  # Use an instance type with NVMe storage
  node_group_instance_types = ["r6gd.2xlarge"]
  node_group_ami_type       = "BOTTLEROCKET_ARM_64"

  # Disable Materialize Operator installation as it will require the storage class
  # TODO: The module will support this configuration in the future
  install_materialize_operator = false

  # Other module parameters...
}
```

The module handles creating the appropriate storage class and configuring Materialize to use it.

## Manual Setup

**TODO:** The following steps will eventually be automated in the Terraform modules for Materialize, but for now they can be done manually.

If you're setting up manually or need to customize the configuration, follow these steps:

### Step 1: Build and Push the Container Image

```bash
# Clone the Materialize repository
git clone https://github.com/MaterializeInc/materialize.git
cd materialize

# Navigate to the container directory
cd misc/nvme-bootstrap/container

# Build the image
docker build -t your-registry/nvme-bootstrap:latest .

# Push to your registry
docker push your-registry/nvme-bootstrap:latest
```

### Step 2: Install OpenEBS

OpenEBS provides the CSI driver that interfaces with LVM to provide persistent storage:

```bash
# Add the OpenEBS Helm repository (the chart that exposes the
# engines.replicated.mayastor.enabled value used below)
helm repo add openebs https://openebs.github.io/openebs
helm repo update

# Create namespace for OpenEBS
kubectl create namespace openebs

# Install OpenEBS with only the necessary components
helm install openebs openebs/openebs \
  --namespace openebs \
  --set engines.replicated.mayastor.enabled=false
```

Verify the installation:

```bash
# Check if the LVM controller is running
kubectl get pods -n openebs -l role=openebs-lvm
```

### Step 3: Deploy the NVMe Bootstrap Components

```bash
# Navigate to the Kubernetes manifests directory
cd misc/nvme-bootstrap/kubernetes

# Apply RBAC resources for the bootstrap component
kubectl apply -f rbac.yaml

# Deploy the DaemonSet (update the image reference if needed)
kubectl apply -f daemonset.yaml

# Get the pod logs to monitor the setup
kubectl logs -n kube-system -l app=nvme-disk-setup

# Wait for the pods to be ready
kubectl -n kube-system wait --for=condition=Ready pods -l app=nvme-disk-setup --timeout=120s
```

The DaemonSet will:

1. Run on all nodes in your cluster
2. Detect available NVMe devices
3. Create the "instance-store-vg" volume group
4. Make the storage available for OpenEBS
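
The detection and volume group steps above can be sketched in shell as follows. This is not the `configure-disks.sh` script from this commit, whose contents are not shown here. The function names, the `DRY_RUN` switch, and the choice to match on the `lsblk` model string that EC2 reports for instance store devices are all assumptions made for illustration.

```shell
VG_NAME="instance-store-vg"

# Read "NAME MODEL" lines (as printed by `lsblk -dn -o NAME,MODEL`) on stdin
# and print the /dev path of every disk that reports the instance store model.
select_instance_store_devices() {
  awk '/Amazon EC2 NVMe Instance Storage/ { print "/dev/" $1 }'
}

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Turn the given devices into LVM physical volumes and group them into one VG.
create_volume_group() {
  if [ "$#" -eq 0 ]; then
    echo "No suitable NVMe devices found" >&2
    return 1
  fi
  run pvcreate "$@"
  run vgcreate "$VG_NAME" "$@"
}

# Example (dry run; prints the LVM commands instead of executing them):
#   create_volume_group $(lsblk -dn -o NAME,MODEL | select_instance_store_devices)
```

Defaulting to a dry run keeps the sketch safe to try on a workstation, since `pvcreate` and `vgcreate` destroy existing data on the devices they are given.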

### Step 4: Create and Test the Storage Class

```bash
# Create the StorageClass
kubectl apply -f storageclass.yaml

# Deploy a test PVC and Pod to verify functionality
kubectl apply -f test-pvc.yaml

# Check if the PVC is bound
kubectl get pvc test-lvm-pvc
```

A successful test shows your storage class is working correctly.
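
The contents of `storageclass.yaml` and `test-pvc.yaml` are not shown in this README. As a rough sketch of what they might contain, the snippet below writes illustrative versions into a scratch directory. The class name, provisioner, and parameters are taken from the Terraform example later in this guide, and the PVC name from the command above. The `volumeBindingMode`, the 1Gi request, and the scratch directory name are assumptions. With `WaitForFirstConsumer`, the PVC binds only once the test Pod is scheduled.

```shell
# Write illustrative manifests to a scratch directory (hypothetical path),
# so the real files in the repository are never overwritten.
out_dir="${OUT_DIR:-./nvme-bootstrap-sketch}"
mkdir -p "$out_dir"

cat > "$out_dir/storageclass.yaml" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvm-instance-store-ext4
provisioner: local.csi.openebs.io
parameters:
  storage: "lvm"
  fsType: "ext4"
  volgroup: "instance-store-vg"
# Assumption: delay binding until a Pod lands on a node that has the VG.
volumeBindingMode: WaitForFirstConsumer
EOF

cat > "$out_dir/test-pvc.yaml" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-lvm-pvc
spec:
  storageClassName: openebs-lvm-instance-store-ext4
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi   # assumption: any small size works for a smoke test
EOF

echo "Wrote sketch manifests to $out_dir"
```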

To clean up the test resources:

```bash
# Delete the test PVC and StorageClass
kubectl delete -f test-pvc.yaml
kubectl delete -f storageclass.yaml
```

### Step 5: Configure Materialize to Use the Storage Class

When installing Materialize, provide the storage class configuration:

```bash
# Create Helm values file
cat > materialize-values.yaml << EOF
storage:
  storageClass:
    create: true
    name: "openebs-lvm-instance-store-ext4"
EOF

# Install Materialize with the storage configuration
helm install my-materialize-operator materialize/materialize-operator \
  --namespace materialize \
  --create-namespace \
  --set observability.podMetrics.enabled=true \
  --values materialize-values.yaml
```

This configures Materialize to use the NVMe-backed storage class for its persistent storage needs.

If you use the [Materialize Helm Terraform module](https://github.com/materializeInc/terraform-helm-materialize), you can configure the `storageClass` block in the `materialize` module to use `openebs-lvm-instance-store-ext4`:

```hcl
...
storage = {
  storageClass = {
    create      = true
    name        = "openebs-lvm-instance-store-ext4"
    provisioner = "local.csi.openebs.io"
    parameters = {
      storage  = "lvm"
      fsType   = "ext4"
      volgroup = "instance-store-vg"
    }
  }
}
...
```

## Verifying the Setup

To verify your NVMe bootstrap setup is working correctly:

```bash
# Check the NVMe setup logs
kubectl logs -n kube-system -l app=nvme-disk-setup

# Check that PVCs can be created with the storage class
kubectl get pvc -A | grep openebs-lvm-instance-store-ext4
```
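
The `grep` above lists every PVC that uses the storage class, whatever its status. As a small optional sketch, the hypothetical helper below (`unbound_pvcs` is not part of this commit) narrows that listing to claims that are not yet `Bound`. It assumes the default column layout of `kubectl get pvc -A`, where the first three columns are namespace, name, and status.

```shell
# Read `kubectl get pvc -A` output on stdin and print "namespace/name status"
# for each PVC of the given storage class whose status is not Bound.
unbound_pvcs() {
  awk -v sc="$1" 'index($0, sc) > 0 && $3 != "Bound" { print $1 "/" $2 " " $3 }'
}

# Example:
#   kubectl get pvc -A | unbound_pvcs openebs-lvm-instance-store-ext4
```

An empty result means every claim on the storage class is bound.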

## Troubleshooting

### Common Issues and Solutions

#### No NVMe Devices Found

**Symptom**: The bootstrap logs show "No suitable NVMe devices found"

**Solution**:
- Verify you're using instance types with NVMe storage (with "d" suffix)
- Check the instance type from the node's well-known label:
  ```bash
  kubectl get node "$NODE_NAME" -o jsonpath='{.metadata.labels.node\.kubernetes\.io/instance-type}'
  ```
- If using AWS, ensure you're using types like `r6gd.2xlarge`, not `r6g.2xlarge`

#### Pod Fails to Create Storage

**Symptom**: LVM setup fails or PVCs remain in Pending status

**Solution**:
- Check if OpenEBS components are running:
  ```bash
  kubectl get pods -n openebs
  ```
- Verify the volume group exists:
  ```bash
  kubectl debug node/$NODE_NAME -it --image=ubuntu -- vgs
  ```
- Check OpenEBS logs:
  ```bash
  kubectl logs -n openebs -l role=openebs-lvm
  ```

#### Permission Issues

**Symptom**: Permission denied errors in logs

**Solution**:
- Verify RBAC resources are correctly applied:
  ```bash
  kubectl get clusterrole node-taint-manager
  kubectl get clusterrolebinding nvme-setup-taint-binding
  ```
- Check the service account:
  ```bash
  kubectl get serviceaccount nvme-setup-sa -n kube-system
  ```

## Clean Up

When you're done testing:

```bash
# Delete the test PVC, bootstrap DaemonSet, RBAC resources, and StorageClass
kubectl delete -f test-pvc.yaml
kubectl delete -f daemonset.yaml
kubectl delete -f rbac.yaml
kubectl delete -f storageclass.yaml

# Delete OpenEBS if no longer needed
helm uninstall openebs -n openebs
kubectl delete namespace openebs
```
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# Copyright Materialize, Inc. and contributors. All rights reserved.
#
# Use of this software is governed by the Business Source License
# included in the LICENSE file at the root of this repository.
#
# As of the Change Date specified in that file, in accordance with
# the Business Source License, use of this software will be governed
# by the Apache License, Version 2.0.

FROM alpine:3.19

RUN apk add --no-cache \
    nvme-cli \
    lvm2 \
    lsblk \
    bash \
    jq \
    curl \
    kubectl

# LVM configuration file
COPY lvm.conf /etc/lvm/lvm.conf
# Disk configuration script
COPY configure-disks.sh /usr/local/bin/configure-disks.sh
# Taint management script
COPY manage-taints.sh /usr/local/bin/manage-taints.sh

RUN chmod +x /usr/local/bin/configure-disks.sh /usr/local/bin/manage-taints.sh

ENTRYPOINT ["/usr/local/bin/configure-disks.sh"]
