Commit 3abbd56

Add startup taint removal functionality to azurelustre CSI driver
The taint key follows the pattern {driverName}/agent-not-ready, e.g., azurelustre.csi.azure.com/agent-not-ready. This allows users to apply startup taints that prevent scheduling before the CSI driver is ready, addressing potential race conditions during node startup.
1 parent ed6d89e commit 3abbd56

10 files changed: +497 -8 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -3,7 +3,7 @@
 [![Coverage Status](https://coveralls.io/repos/github/kubernetes-sigs/azurelustre-csi-driver/badge.svg?branch=main)](https://coveralls.io/github/kubernetes-sigs/azurelustre-csi-driver?branch=main)
 [![FOSSA Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2Fkubernetes-sigs%2Fazurelustre-csi-driver.svg?type=shield)](https://app.fossa.com/projects/git%2Bgithub.com%2Fkubernetes-sigs%2Fazurelustre-csi-driver?ref=badge_shield)

-### About
+## About

 This driver allows Kubernetes to access Azure Lustre file system.

@@ -12,7 +12,7 @@ This driver allows Kubernetes to access Azure Lustre file system.

  

-### Container Images & Kubernetes Compatibility:
+### Container Images & Kubernetes Compatibility

 | Driver version | Image | Supported k8s version | Lustre client version |
 |-----------------|-----------------------------------------------------------------|-----------------------|-----------------------|

deploy/rbac-csi-azurelustre-node.yaml

Lines changed: 3 additions & 0 deletions
@@ -14,6 +14,9 @@ rules:
   - apiGroups: [""]
     resources: ["secrets"]
     verbs: ["get", "list"]
+  - apiGroups: [""]
+    resources: ["nodes"]
+    verbs: ["get", "patch"]

 ---
 kind: ClusterRoleBinding
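
Note on the new `nodes` rule: removing a taint typically means reading the node (`get`) and then writing back a taint list without the startup entry (`patch`). The following is a minimal sketch of that computation using only the Go standard library. It is not the driver's actual code: the `Taint` type, the helper names, and the merge-style patch shape are assumptions made for illustration, and the second taint in `main` is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Taint mirrors the fields of a Kubernetes node taint that matter here.
type Taint struct {
	Key    string `json:"key"`
	Value  string `json:"value,omitempty"`
	Effect string `json:"effect"`
}

// notReadyTaintKey builds the startup taint key from the driver name,
// following the {driverName}/agent-not-ready pattern in the commit message.
func notReadyTaintKey(driverName string) string {
	return driverName + "/agent-not-ready"
}

// withoutTaint returns the taints that remain after dropping every taint
// with the given key, and reports whether anything was removed.
func withoutTaint(taints []Taint, key string) ([]Taint, bool) {
	kept := make([]Taint, 0, len(taints))
	for _, t := range taints {
		if t.Key != key {
			kept = append(kept, t)
		}
	}
	return kept, len(kept) != len(taints)
}

// taintPatch renders a patch body that replaces spec.taints (assumed shape).
func taintPatch(taints []Taint) ([]byte, error) {
	body := map[string]any{"spec": map[string]any{"taints": taints}}
	return json.Marshal(body)
}

func main() {
	current := []Taint{
		{Key: "azurelustre.csi.azure.com/agent-not-ready", Effect: "NoSchedule"},
		{Key: "example.com/dedicated", Value: "gpu", Effect: "NoSchedule"}, // hypothetical unrelated taint
	}
	kept, changed := withoutTaint(current, notReadyTaintKey("azurelustre.csi.azure.com"))
	patch, err := taintPatch(kept)
	if err != nil {
		panic(err)
	}
	fmt.Println(changed, string(patch))
}
```

Only the startup taint is dropped; any other taints on the node are carried over unchanged in the patch body.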

docs/csi-debug.md

Lines changed: 69 additions & 0 deletions
@@ -448,6 +448,75 @@ Check for solutions in [Resolving Common Errors](errors.md)

 ---

+## Pod Scheduling and Node Readiness Issues
+
+### Pods Stuck in Pending Status with Taint-Related Errors
+
+**Symptoms:**
+
+- Pods requiring Azure Lustre storage remain in `Pending` status
+- Pod events show taint-related scheduling failures
+- Error messages mentioning `azurelustre.csi.azure.com/agent-not-ready` taint
+
+**Check pod scheduling status:**
+
+```sh
+kubectl describe pod <pod-name>
+```
+
+Look for events such as:
+
+- `Warning FailedScheduling ... node(s) had taint {azurelustre.csi.azure.com/agent-not-ready: }, that the pod didn't tolerate`
+- `0/X nodes are available: X node(s) had taint {azurelustre.csi.azure.com/agent-not-ready}`
+
+**Check node taints:**
+
+```sh
+kubectl describe nodes | grep -A5 -B5 "azurelustre.csi.azure.com/agent-not-ready"
+```
+
+**Check CSI driver readiness on nodes:**
+
+```sh
+# Check if CSI driver pods are running on all nodes
+kubectl get pods -n kube-system -l app=csi-azurelustre-node -o wide
+
+# Check CSI driver logs for startup issues
+kubectl logs -n kube-system -l app=csi-azurelustre-node -c azurelustre --tail=100 | grep -i "taint\|ready\|error"
+```
+
+**Common causes and solutions:**
+
+1. **CSI Driver Still Starting**: Wait for CSI driver pods to become `Ready`
+
+```sh
+kubectl wait --for=condition=ready pod -l app=csi-azurelustre-node -n kube-system --timeout=300s
+```
+
+2. **Lustre Module Loading Issues**: Check if Lustre kernel modules are properly loaded
+
+```sh
+kubectl exec -n kube-system <csi-azurelustre-node-pod> -c azurelustre -- lsmod | grep lustre
+```
+
+3. **Manual Taint Removal** (Emergency only - not recommended for production):
+
+```sh
+kubectl taint nodes <node-name> azurelustre.csi.azure.com/agent-not-ready:NoSchedule-
+```
+
+**Verify taint removal functionality:**
+
+Check that startup taint removal is enabled in the CSI driver:
+
+```sh
+# --tail=-1 is needed because kubectl shows only the last 10 lines per pod when a label selector is used
+kubectl logs -n kube-system -l app=csi-azurelustre-node -c azurelustre --tail=-1 | grep -i "remove.*taint"
+```
+
+Expected log output should show taint removal activity when the driver becomes ready.
+
+---
+
 ## Get Azure Lustre Driver Version

 ```sh

docs/driver-parameters.md

Lines changed: 23 additions & 0 deletions
@@ -4,6 +4,29 @@ These are the parameters to be passed into the custom StorageClass that users mu

 For more information, see the [Azure Managed Lustre Filesystem (AMLFS) service documentation](https://learn.microsoft.com/en-us/azure/azure-managed-lustre/) and the [AMLFS CSI documentation](https://learn.microsoft.com/en-us/azure/azure-managed-lustre/use-csi-driver-kubernetes).

+## CSI Driver Configuration Parameters
+
+These parameters control the behavior of the Azure Lustre CSI driver itself and are typically configured during driver installation rather than in StorageClass definitions.
+
+### Node Startup Taint Management
+
+Name | Meaning | Available Value | Default Value | Configuration Method
+--- | --- | --- | --- | ---
+remove-not-ready-taint | Controls whether the CSI driver automatically removes startup taints from nodes when the driver becomes ready. This ensures pods are only scheduled to nodes where the CSI driver is fully operational and Lustre filesystem capacity is available. Nodes should have a taint of the form: `azurelustre.csi.azure.com/agent-not-ready:NoSchedule` | `true`, `false` | `true` | Command-line flag `--remove-not-ready-taint` in driver deployment
+
+#### Startup Taint Details
+
+When enabled (default), the Azure Lustre CSI driver will:
+
+1. **Monitor Node Readiness**: Check if the CSI driver is fully initialized on the node
+2. **Remove Blocking Taint**: Automatically remove the `azurelustre.csi.azure.com/agent-not-ready:NoSchedule` taint when ready
+
+This mechanism prevents pods requiring Azure Lustre storage from being scheduled to nodes where:
+
+- Lustre kernel modules are not yet loaded
+- CSI driver components are not fully initialized
+- Network connectivity to Lustre filesystems is not established
+
 ## Dynamic Provisioning (Create an AMLFS Cluster through AKS) - Public Preview

 > **Public Preview Notice**: Dynamic provisioning functionality is currently in public preview. Some features may not be supported or may have constrained capabilities.
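
Note on `--remove-not-ready-taint`: the table above documents a boolean flag that defaults to `true`. As a rough sketch of how such a flag behaves, assuming it is parsed with Go's standard `flag` package (the diff shown here does not confirm this), the `Config` type, `parseArgs` helper, and flag-set name below are hypothetical:

```go
package main

import (
	"flag"
	"fmt"
)

// Config holds the one setting this sketch cares about.
type Config struct {
	RemoveNotReadyTaint bool
}

// parseArgs parses driver-style arguments; taint removal defaults to true,
// matching the default documented in the parameter table.
func parseArgs(args []string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("azurelustre-sketch", flag.ContinueOnError) // hypothetical name
	fs.BoolVar(&cfg.RemoveNotReadyTaint, "remove-not-ready-taint", true,
		"remove the agent-not-ready startup taint once the driver is ready")
	err := fs.Parse(args)
	return cfg, err
}

func main() {
	def, _ := parseArgs(nil)
	off, _ := parseArgs([]string{"--remove-not-ready-taint=false"})
	fmt.Println(def.RemoveNotReadyTaint, off.RemoveNotReadyTaint)
}
```

With Go's `flag` package, a boolean flag must be disabled with the `=false` form. A space-separated `--remove-not-ready-taint false` would leave the flag set to `true` and treat `false` as a positional argument.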

docs/errors.md

Lines changed: 85 additions & 1 deletion
@@ -11,6 +11,9 @@ This document describes common errors that can occur during volume creation and
 - [Error: Resource not found](#error-resource-not-found)
 - [Error: Cannot create AMLFS cluster, not enough IP addresses available](#error-cannot-create-amlfs-cluster-not-enough-ip-addresses-available)
 - [Error: Reached Azure Subscription Quota Limit for AMLFS Clusters](#error-reached-azure-subscription-quota-limit-for-amlfs-clusters)
+- [Pod Scheduling Errors](#pod-scheduling-errors)
+- [Node Readiness and Taint Errors](#node-readiness-and-taint-errors)
+- [Error: Node had taint azurelustre.csi.azure.com/agent-not-ready](#error-node-had-taint-azurelustrecsiazurecomagent-not-ready)
 - [Volume Mounting Errors](#volume-mounting-errors)
 - [Node Mount Errors](#node-mount-errors)
 - [Error: Could not mount target](#error-could-not-mount-target)
@@ -31,7 +34,7 @@ This document describes common errors that can occur during volume creation and
 - [Controller Logs](#controller-logs)
 - [Node Logs](#node-logs)
 - [Comprehensive Log Collection](#comprehensive-log-collection)
-
+
 ---

 ## Volume Creation Errors
@@ -211,6 +214,87 @@ There is not enough room in the /subscriptions/<sub-id>/resourceGroups/<rg>/prov

 ---

+## Pod Scheduling Errors
+
+### Node Readiness and Taint Errors
+
+#### Error: Node had taint azurelustre.csi.azure.com/agent-not-ready
+
+**Symptoms:**
+
+- Pods requiring Azure Lustre storage remain stuck in `Pending` status
+- Pod events show taint-related scheduling failures:
+  - `Warning FailedScheduling ... node(s) had taint {azurelustre.csi.azure.com/agent-not-ready: }, that the pod didn't tolerate`
+  - `0/X nodes are available: X node(s) had taint {azurelustre.csi.azure.com/agent-not-ready}`
+- `kubectl describe pod` shows scheduling failures due to taints
+
+**Possible Causes:**
+
+- CSI driver is still initializing on nodes
+- Lustre kernel modules are not yet loaded
+- CSI driver failed to start properly on affected nodes
+- Node is not ready to handle Azure Lustre volume allocations
+- CSI driver startup taint removal is disabled
+
+**Debugging Steps:**
+
+```bash
+# Check pod scheduling status
+kubectl describe pod <pod-name> | grep -A10 Events
+
+# Check which nodes have the taint
+kubectl describe nodes | grep -A5 -B5 "azurelustre.csi.azure.com/agent-not-ready"
+
+# Verify CSI driver pod status on nodes
+kubectl get pods -n kube-system -l app=csi-azurelustre-node -o wide
+
+# Check CSI driver startup logs
+kubectl logs -n kube-system -l app=csi-azurelustre-node -c azurelustre --tail=100 | grep -i "taint\|ready\|error"
+
+# Verify taint removal is enabled (should be true by default)
+# --tail=-1 is needed because kubectl shows only the last 10 lines per pod when a label selector is used
+kubectl logs -n kube-system -l app=csi-azurelustre-node -c azurelustre --tail=-1 | grep -i "remove.*taint"
+```
+
+**Resolution:**
+
+1. **Wait for CSI Driver Readiness** (most common case):
+
+```bash
+# Wait for CSI driver pods to report the Ready condition
+kubectl wait --for=condition=ready pod -l app=csi-azurelustre-node -n kube-system --timeout=300s
+```
+
+The taint should be automatically removed once the CSI driver is fully operational.
+
+2. **Check Lustre Module Loading**:
+
+```bash
+# Verify Lustre modules are loaded on nodes
+kubectl exec -n kube-system <csi-azurelustre-node-pod> -c azurelustre -- lsmod | grep lustre
+```
+
+3. **Verify CSI Driver Configuration**:
+
+```bash
+# Check if taint removal is enabled (default: true); the node plugin runs as a DaemonSet
+kubectl get daemonset csi-azurelustre-node -n kube-system -o yaml | grep "remove-not-ready-taint"
+```
+
+4. **Emergency Manual Taint Removal** (not recommended for production):
+
+```bash
+# Only use if CSI driver is confirmed working but taint persists
+kubectl taint nodes <node-name> azurelustre.csi.azure.com/agent-not-ready:NoSchedule-
+```
+
+**Prevention:**
+
+- Ensure CSI driver has sufficient time to initialize during cluster updates
+- Monitor CSI driver health during node scaling operations
+- Use pod disruption budgets to prevent scheduling issues during maintenance
+
+---
+
 ## Volume Mounting Errors

 ### Node Mount Errors
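
Note on the debugging steps above: grepping `kubectl describe nodes` output works, but the node JSON can be checked more precisely. The following is a minimal sketch, assuming the input has the shape of `kubectl get nodes -o json` output (an `items` array whose entries carry `metadata.name` and `spec.taints`). The `taintedNodes` helper and the sample node names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const notReadyKey = "azurelustre.csi.azure.com/agent-not-ready"

// nodeList models only the parts of the node list JSON used below.
type nodeList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Spec struct {
			Taints []struct {
				Key    string `json:"key"`
				Effect string `json:"effect"`
			} `json:"taints"`
		} `json:"spec"`
	} `json:"items"`
}

// taintedNodes returns the names of nodes that still carry the startup taint.
func taintedNodes(raw []byte) ([]string, error) {
	var list nodeList
	if err := json.Unmarshal(raw, &list); err != nil {
		return nil, err
	}
	var names []string
	for _, n := range list.Items {
		for _, t := range n.Spec.Taints {
			if t.Key == notReadyKey {
				names = append(names, n.Metadata.Name)
				break
			}
		}
	}
	return names, nil
}

func main() {
	// Hypothetical sample input; node names are made up.
	sample := []byte(`{"items":[
	  {"metadata":{"name":"node-a"},"spec":{"taints":[{"key":"azurelustre.csi.azure.com/agent-not-ready","effect":"NoSchedule"}]}},
	  {"metadata":{"name":"node-b"},"spec":{}}
	]}`)
	names, err := taintedNodes(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

A node that appears in this list long after its CSI driver pod is `Ready` is a candidate for the configuration check or the manual removal step described above.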

docs/install-csi-driver.md

Lines changed: 20 additions & 0 deletions
@@ -121,3 +121,23 @@ The CSI driver deployment includes automated **exec-based readiness probes** for

 **Important**: The enhanced validation ensures the driver reports ready only when LNet is fully functional for Lustre operations. Wait for all CSI driver node pods to pass enhanced readiness checks before creating PersistentVolumes or mounting Lustre filesystems.

+## Startup Taints
+
+When the CSI driver starts on each node, it automatically removes the following taint if present:
+
+- **Taint Key**: `azurelustre.csi.azure.com/agent-not-ready`
+- **Taint Effect**: `NoSchedule`
+
+This ensures that:
+
+1. **Node Readiness**: Pods requiring Azure Lustre storage are only scheduled to nodes where the CSI driver is fully initialized
+2. **Lustre Client Ready**: The node has successfully loaded Lustre kernel modules and networking components
+
+### Configuring Startup Taint Behavior
+
+The startup taint functionality is enabled by default but can be configured during installation:
+
+- **Default Behavior**: Startup taint removal is **enabled** by default
+- **Disable Taint Removal**: To disable, set `--remove-not-ready-taint=false` in the driver deployment
+
+For most AKS users, the default behavior provides optimal pod scheduling and should not be changed.
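
Note on why the startup taint keeps workloads off a node: a `NoSchedule` taint blocks any pod that has no matching toleration. The sketch below is a simplified model of the Kubernetes matching rules, not the scheduler's code. The catch-all `Exists` toleration shown for the CSI node pods is an assumption about how they get onto tainted nodes and is not confirmed by this diff.

```go
package main

import "fmt"

// Taint and Toleration carry only the fields needed for matching.
type Taint struct{ Key, Value, Effect string }
type Toleration struct{ Key, Operator, Value, Effect string }

// tolerates applies the Kubernetes matching rules in simplified form:
// an empty effect or key on the toleration matches any effect or key.
func tolerates(tol Toleration, t Taint) bool {
	if tol.Effect != "" && tol.Effect != t.Effect {
		return false
	}
	if tol.Key != "" && tol.Key != t.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		return true
	case "", "Equal":
		return tol.Value == t.Value
	}
	return false
}

// schedulable reports whether every scheduling-blocking taint is tolerated.
func schedulable(taints []Taint, tols []Toleration) bool {
	for _, t := range taints {
		if t.Effect != "NoSchedule" && t.Effect != "NoExecute" {
			continue // PreferNoSchedule does not block scheduling
		}
		tolerated := false
		for _, tol := range tols {
			if tolerates(tol, t) {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return false
		}
	}
	return true
}

func main() {
	startup := Taint{Key: "azurelustre.csi.azure.com/agent-not-ready", Effect: "NoSchedule"}
	node := []Taint{startup}

	workload := []Toleration{}                   // typical application pod
	driver := []Toleration{{Operator: "Exists"}} // assumed toleration on the CSI node pods

	fmt.Println(schedulable(node, workload)) // false: pod stays Pending
	fmt.Println(schedulable(node, driver))   // true: driver pod can start
	fmt.Println(schedulable(nil, workload))  // true: taint already removed
}
```

This is why the sequence described above works: the driver pod lands on the tainted node first, and application pods become schedulable only after the taint is gone.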

go.mod

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,9 @@ require (
 	golang.org/x/net v0.44.0
 	google.golang.org/grpc v1.75.1
 	google.golang.org/protobuf v1.36.9
+	k8s.io/api v0.31.13
 	k8s.io/apimachinery v0.31.13
+	k8s.io/client-go v1.5.2
 	k8s.io/klog/v2 v2.130.1
 	k8s.io/kubernetes v1.31.13
 	k8s.io/mount-utils v0.31.6
@@ -122,10 +124,8 @@ require (
 	gopkg.in/inf.v0 v0.9.1 // indirect
 	gopkg.in/yaml.v2 v2.4.0 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
-	k8s.io/api v0.31.13 // indirect
 	k8s.io/apiextensions-apiserver v0.31.1 // indirect
 	k8s.io/apiserver v0.31.13 // indirect
-	k8s.io/client-go v1.5.2 // indirect
 	k8s.io/cloud-provider v0.31.1 // indirect
 	k8s.io/component-base v0.31.13 // indirect
 	k8s.io/component-helpers v0.31.13 // indirect
