This repository contains a comprehensive tutorial for deploying and managing AI inference workloads on Kubernetes clusters with AMD GPUs.
- Ubuntu/Debian server with AMD GPUs
- Root/sudo access
- At least 2GB RAM and 20GB free disk space
- Internet connectivity for package downloads
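If you want to eyeball these requirements before running the automated check below, a few standard commands cover them. This is just a manual sanity check; the `lspci` filter is a loose heuristic and the connectivity URL is only one example endpoint:

```bash
# Confirm the OS release (Ubuntu/Debian expected)
lsb_release -a

# Check memory and free disk space against the 2GB / 20GB minimums
free -h
df -h /

# Look for AMD GPUs on the PCI bus (works before any ROCm tooling is installed)
lspci | grep -i amd | grep -iE 'vga|display|processing accelerator'

# Rough connectivity check for package downloads (any reliable endpoint works)
curl -sI https://pkgs.k8s.io >/dev/null && echo "network reachable"
```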
Before starting the installation, run the enhanced system check to ensure your system is ready and resolve any potential issues:
```bash
sudo ./check-system-enhanced.sh
```

This educational script will:
- Auto-detect and resolve update manager lock issues (a common problem; see the sketch after this list)
- Explain what each check does and why it's important
- Verify system requirements (OS, disk space, memory, network)
- Detect container environment vs bare metal
- Check Kubernetes installation status
- Provide learning recommendations and next steps
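The lock handling is worth a closer look, since a background `unattended-upgrades` run is the usual culprit when apt appears stuck. Below is a minimal sketch of that kind of check; the actual logic in check-system-enhanced.sh may differ, but the lock paths are the standard Debian/Ubuntu ones:

```bash
#!/usr/bin/env bash
# Sketch: detect (and wait out) apt/dpkg lock contention before installing packages.
LOCKS=(/var/lib/dpkg/lock-frontend /var/lib/dpkg/lock /var/lib/apt/lists/lock)

for lock in "${LOCKS[@]}"; do
    # fuser exits 0 if any process currently holds the lock file
    if sudo fuser "$lock" >/dev/null 2>&1; then
        echo "Another package manager is holding $lock (likely unattended-upgrades)."
        echo "Waiting for it to finish..."
        while sudo fuser "$lock" >/dev/null 2>&1; do sleep 5; done
    fi
done
echo "No package manager locks detected; safe to proceed."
```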
Once the system check passes, install Kubernetes:

```bash
sudo ./install-kubernetes.sh
```

This script will:
- Install vanilla Kubernetes 1.28+ on Ubuntu/Debian
- Configure containerd container runtime
- Set up Calico CNI networking
- Configure single-node cluster (removes control-plane taints)
- Disable swap and configure kernel settings (sketched after this list)
- Verify cluster functionality
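The swap and kernel configuration follows the standard kubeadm prerequisites. A condensed sketch of that portion is shown here, assuming the usual containerd/Calico setup; the exact commands in install-kubernetes.sh may differ:

```bash
# Disable swap now and keep it off across reboots
sudo swapoff -a
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab

# Load kernel modules needed by containerd and Calico
echo -e "overlay\nbr_netfilter" | sudo tee /etc/modules-load.d/k8s.conf
sudo modprobe overlay
sudo modprobe br_netfilter

# Let iptables see bridged traffic and enable IP forwarding
cat <<'EOF' | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
sudo sysctl --system

# Single-node cluster: allow workloads to schedule on the control-plane node
kubectl taint nodes --all node-role.kubernetes.io/control-plane- || true
```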
With the cluster running, install the AMD GPU Operator:

```bash
./install-amd-gpu-operator.sh
```

This script will:
- Install Helm (if not present)
- Install cert-manager (prerequisite)
- Install AMD GPU Operator
- Configure device settings for vanilla Kubernetes
- Set up persistent storage for AI models
- Verify the installation
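A few quick checks confirm the operator is healthy before moving on. The namespace and label filters below assume a default install; `amd.com/gpu` is the resource name the AMD device plugin registers with Kubernetes:

```bash
# cert-manager should be Running before the GPU operator components start
kubectl get pods -n cert-manager

# Each GPU should appear in the node's allocatable resources as amd.com/gpu
kubectl describe node | grep -i 'amd.com/gpu'

# The Node Labeller adds GPU-related labels (device ID, VRAM, etc.) to the node
kubectl get nodes --show-labels | tr ',' '\n' | grep -i 'amd'
```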
Finally, deploy the vLLM inference server:

```bash
./deploy-vllm-inference.sh
```

This script will:
- Install MetalLB load balancer
- Deploy vLLM inference server with Llama-3.2-1B model
- Create LoadBalancer service for external access
- Generate test scripts for API validation
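Once the pod is up, the server speaks vLLM's OpenAI-compatible API, so a couple of curl calls make a quick manual smoke test. The service name below is a placeholder for whatever the deployment creates, port 8000 is vLLM's default, and the model identifier must match what `/v1/models` reports:

```bash
# External IP assigned by MetalLB (replace the service name with the real one
# from `kubectl get svc`)
VLLM_IP=$(kubectl get svc vllm-inference \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# List the models the server exposes
curl -s "http://${VLLM_IP}:8000/v1/models"

# Send a short completion request
curl -s "http://${VLLM_IP}:8000/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.2-1B", "prompt": "Kubernetes is", "max_tokens": 32}'
```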
The components fit together in four layers:

```
┌─────────────────────────────────────────────┐
│                Applications                 │
│  ┌─────────────┐   ┌─────────────┐          │
│  │    vLLM     │   │   Jupyter   │   ...    │
│  │  Inference  │   │  Notebooks  │          │
│  └─────────────┘   └─────────────┘          │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│              Kubernetes Layer               │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  │
│  │   Pods   │  │ Services │  │Deployments│  │
│  └──────────┘  └──────────┘  └───────────┘  │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│              AMD GPU Operator               │
│  ┌──────────────┐   ┌─────────────────┐     │
│  │Device Plugin │   │  Node Labeller  │     │
│  └──────────────┘   └─────────────────┘     │
└─────────────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│           Hardware Infrastructure           │
│  ┌─────────────────────────────────────┐    │
│  │      AMD Instinct MI300X GPUs       │    │
│  │  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐    │    │
│  │  │GPU 0│ │GPU 1│ │GPU 2│ │GPU 3│    │    │
│  │  └─────┘ └─────┘ └─────┘ └─────┘    │    │
│  └─────────────────────────────────────┘    │
└─────────────────────────────────────────────┘
```