A feature-rich base image designed for GPU computing on Vast.ai. This image extends large, commonly-used base images to maximize Docker layer caching benefits, resulting in faster instance startup times.
Vast.ai host machines cache commonly-used Docker image layers. By building on top of large, popular base images like nvidia/cuda and rocm/dev-ubuntu, most of the image content is already present on host machines before you even start your instance.
How it works:
- Base images (NVIDIA CUDA, AMD ROCm, Ubuntu) are multi-gigabyte images commonly used across the platform
- These base layers are frequently cached on Vast.ai hosts
- Our image adds development tools, security features, and convenience utilities as additional layers
- When you start an instance, only the smaller top layers need to be downloaded
- Result: Fast startup times despite having a comprehensive development environment
Vast.ai's backend automatically selects the appropriate image variant based on the host machine's maximum supported CUDA version (determined by the installed NVIDIA driver). When you rent a machine:
- The system detects the host's maximum CUDA capability from its NVIDIA driver (e.g., supports up to CUDA 12.9)
- It finds the most recently pushed Docker image tag containing a compatible CUDA version
- It pulls that specific image variant
Example: A machine with drivers supporting CUDA 12.8 will pull the cuda-12.8.1-* variant, while a newer machine supporting CUDA 12.9 will pull cuda-12.9.1-*. This ensures you always get the best compatible version without manual configuration.
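You can check this value yourself with nvidia-smi; the "CUDA Version" field in its header is the driver's maximum supported CUDA version, which is what the selection logic keys off:
# Show only the header line containing the driver's maximum supported CUDA version
nvidia-smi | grep "CUDA Version"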
We build multiple variants to support different hardware and Python requirements:
| Type | Base Image | Ubuntu | Notes |
|---|---|---|---|
| Stock | ubuntu:22.04, ubuntu:24.04 | 22.04, 24.04 | No CUDA/ROCm libraries, but NVIDIA drivers are still loaded at runtime |
| NVIDIA CUDA | nvidia/cuda:*-cudnn-devel-ubuntu* | 22.04, 24.04 | Full CUDA toolkit + cuDNN (11.8, 12.1, 12.4, 12.6, 12.8, 12.9, 13.0.1, 13.0.2) |
| AMD ROCm | rocm/dev-ubuntu-*:6.2.4-complete | 22.04, 24.04 | Complete ROCm 6.2.4 development environment |
Note: Stock images can still access NVIDIA GPUs—they simply don't include the heavier CUDA development libraries. Use these when you want a lighter image and will install specific CUDA components yourself.
Each base image variant is available with Python 3.7 through 3.14. The default Python version matches the Ubuntu release:
- Ubuntu 22.04: Python 3.10
- Ubuntu 24.04: Python 3.12
Explicit tags for specific configurations:
vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04-py310
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^
                  Base image identifier               Python version
Tags without the Python suffix use the Ubuntu default (e.g., cuda-12.8.1-cudnn-devel-ubuntu22.04 uses Python 3.10).
Pre-built images are available on DockerHub.
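For example, to pull a specific variant yourself (tags follow the pattern described above):
# Default Python for Ubuntu 22.04 (3.10)
docker pull vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04
# Same base image with the Python version pinned explicitly
docker pull vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04-py310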
| Category | Tools Included |
|---|---|
| Python | Miniforge/Conda, uv package manager, pre-configured /venv/main environment |
| Build Tools | build-essential, cmake, ninja-build, gdb, libssl-dev |
| Version Control | git, git-lfs |
| Node.js | NVM with latest LTS version |
| Editors | vim, nano |
| Shell Utilities | curl, wget, jq, rsync, rclone, zip/unzip, zstd |
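For example, to work inside the pre-configured environment from a shell on the instance (the requests install is just an illustration; uv can be used in place of pip if you prefer):
# Activate the main virtual environment
source /venv/main/bin/activate
# Install a package and confirm it is importable
pip install requests
python -c "import requests; print(requests.__version__)"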
| Feature | Description |
|---|---|
| CUDA | Full development toolkit with cuDNN (NVIDIA CUDA variants) |
| ROCm | Complete ROCm development environment (AMD variants) |
| OpenCL | Headers, ICD loaders, and runtime for both NVIDIA and AMD |
| Vulkan | Runtime and tools |
| NVIDIA Extras | OpenGL, video encode/decode libraries (auto-selected for driver compatibility) |
| Infiniband | rdma-core, libibverbs, infiniband-diags for high-speed networking |
| Tool | Purpose |
|---|---|
| htop | Interactive process viewer |
| nvtop | GPU process monitoring |
| iotop | I/O usage monitoring |
| Application | Description | Default Port |
|---|---|---|
| Jupyter | Interactive Python notebooks | 8080 |
| Syncthing | Peer-to-peer file synchronization | 8384 |
| Tensorboard | ML experiment visualization | 6006 |
| Cron | Task scheduling | - |
All applications are managed by Supervisor with configs in /etc/supervisor/conf.d/.
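From a shell on the instance you can inspect and control these services with standard supervisorctl commands (the program names below are illustrative; supervisorctl status shows the exact names in use):
# List all managed services and their current state
supervisorctl status
# Restart or stop an individual service
supervisorctl restart jupyter
supervisorctl stop syncthing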
| Tool | Purpose |
|---|---|
| Vast CLI | Manage Vast.ai instances from the command line |
| magic-wormhole | Secure file transfer between machines |
| Syncthing | Keep files synchronized across instances |
| rclone | Cloud storage management |
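As a quick illustration of the transfer tooling (file names are placeholders), magic-wormhole moves a file between any two machines using a one-time code:
# On the sending machine: prints a one-time code to share with the receiver
wormhole send results.tar.gz
# On the receiving machine: enter the code when prompted
wormhole receive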
- Non-root user: user account (UID 1001) with passwordless sudo
- Shared permissions: umask 002 for collaborative file access
- SSH key propagation: Keys automatically set up for both root and user accounts
The Instance Portal is a web-based dashboard for managing applications running on your instance. It provides secure access through TLS, authentication, and Cloudflare tunnels.
Access the portal by clicking "Open" on your instance card in the Vast.ai console. See the Instance Portal documentation for complete details.
The PORTAL_CONFIG environment variable defines which applications appear in the Instance Portal. Format:
hostname:external_port:internal_port:path:name|hostname:external_port:internal_port:path:name|...
| Field | Description |
|---|---|
| hostname | Usually localhost |
| external_port | Port exposed via -p flag (must be open in template) |
| internal_port | Port where your application listens |
| path | URL path for access (usually /) |
| name | Display name in the portal |
Example:
PORTAL_CONFIG="localhost:1111:11111:/:Instance Portal|localhost:8080:18080:/:Jupyter|localhost:7860:17860:/:My App"Port behavior:
- When
external_port≠internal_port: Caddy reverse proxy makes the app available on the external port with TLS and authentication - When
external_port=internal_port: The application bypasses proxying (direct access), but tunnel links are still created
The configuration is written to /etc/portal.yaml on first boot. You can edit this file at runtime and restart Caddy with supervisorctl restart caddy.
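For example, to adjust the running configuration:
# Edit the generated Portal configuration, then restart Caddy to apply it
nano /etc/portal.yaml
supervisorctl restart caddy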
To enable HTTPS for all proxied applications, set:
ENABLE_HTTPS=true

When enabled:
- Caddy serves applications over HTTPS using certificates at /etc/instance.crt and /etc/instance.key
- Self-signed certificates are generated automatically during boot
- Install the Vast.ai Jupyter certificate locally to avoid browser warnings
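A quick way to verify TLS from inside the instance, assuming an application proxied on external port 8080, is to request it with certificate verification disabled (required because the certificate is self-signed):
# -k skips verification of the self-signed certificate; -I shows only response headers
curl -kI https://localhost:8080/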
Authentication is enabled by default for all proxied ports. Access methods:
- Open Button: Click "Open" on your instance card; this automatically sets an auth cookie
- Basic Auth: Username vastai, password is your OPEN_BUTTON_TOKEN
- Bearer Token: Include an Authorization: Bearer ${OPEN_BUTTON_TOKEN} header for API access (see the example below)
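For scripted access, either method can be supplied with curl (a minimal sketch; substitute your instance's address and the external port of the application, and use https if ENABLE_HTTPS is set):
# Bearer token (OPEN_BUTTON_TOKEN is available in the instance environment)
curl -H "Authorization: Bearer ${OPEN_BUTTON_TOKEN}" "http://<instance-address>:<external-port>/"
# Basic auth with the default username; the password defaults to OPEN_BUTTON_TOKEN
curl -u "vastai:${OPEN_BUTTON_TOKEN}" "http://<instance-address>:<external-port>/"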
Related variables:
| Variable | Description |
|---|---|
| ENABLE_AUTH | Set to false to disable authentication (default: true) |
| AUTH_EXCLUDE | Comma-separated list of external ports to exclude from auth |
| WEB_USERNAME | Custom username for basic auth (default: vastai) |
| WEB_PASSWORD | Custom password (default: auto-generated or OPEN_BUTTON_TOKEN) |
The Instance Portal automatically creates Cloudflare tunnels for your applications, providing URLs like:
https://four-random-words.trycloudflare.com
For persistent custom domains, set CF_TUNNEL_TOKEN to your Cloudflare tunnel token. Note: Each running instance requires a separate tunnel token.
The default boot script (/opt/instance-tools/bin/boot_default.sh) accepts these arguments:
| Argument | Description |
|---|---|
| --no-user-keys | Skip SSH key propagation to the user account |
| --no-export-env | Don't export environment variables to /etc/environment |
| --no-cert-gen | Skip TLS certificate generation |
| --no-update-portal | Don't check for Instance Portal updates |
| --no-update-vast | Don't check for Vast CLI updates |
| --no-activate-pyenv | Don't activate Python environment in shell |
| --sync-environment | Sync Python/Conda environments to workspace volume for persistence |
| --sync-home | Sync home directories to workspace |
| --jupyter-override | Force Jupyter to start even in non-Jupyter launch modes |
Example (in Docker run command or template):
/opt/instance-tools/bin/entrypoint.sh --sync-environment --no-update-portal

| Variable | Description |
|---|---|
| BOOT_SCRIPT | URL to a custom boot script that replaces the entire default startup routine |
| HOTFIX_SCRIPT | URL to a script that runs very early in boot, before most initialization; use it to patch broken containers |
| PROVISIONING_SCRIPT | URL to a script that runs after Supervisor starts; use it to install packages and configure applications |
| SERVERLESS | Set to true to skip update checks for faster cold starts |
Execution order:
1. BOOT_SCRIPT (if set, replaces everything below)
2. HOTFIX_SCRIPT (runs first, can modify any part of startup)
3. Normal boot sequence (environment setup, workspace sync, TLS certs, Supervisor)
4. PROVISIONING_SCRIPT (runs after Supervisor, installs your customizations)
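On Vast.ai these are normally set as environment variables in your template; the docker run form below is an equivalent local sketch (the script URL is a placeholder):
# Run a provisioning script after Supervisor starts and skip update checks
docker run --gpus all \
  -e PROVISIONING_SCRIPT="https://example.com/provision.sh" \
  -e SERVERLESS="true" \
  vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04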
For derivative images, you can add custom scripts to /etc/vast_boot.d/ to hook into the boot sequence. Scripts are sourced in alphabetical order by filename, so use numeric prefixes to control ordering:
/etc/vast_boot.d/
├── 10-prep-env.sh
├── 25-first-boot.sh
├── 35-sync-home-dirs.sh
├── ...
├── 65-supervisor-launch.sh
├── 75-provisioning-script.sh
├── 80-my-custom-script.sh # Runs every boot
└── first_boot/
├── 05-update-vast.sh # Runs only on first boot
└── 20-my-first-boot.sh # Your first-boot script
- Every boot: Add scripts directly to /etc/vast_boot.d/
- First boot only: Add scripts to /etc/vast_boot.d/first_boot/
Key ordering points:
- Scripts before 65-* run before Supervisor starts
- Scripts at 75-* or later run after Supervisor is running
- First-boot scripts (in first_boot/) are sourced at position 25-*
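A minimal sketch of a custom every-boot hook (the file name and its contents are illustrative; remember that hooks are sourced rather than executed):
# /etc/vast_boot.d/80-my-custom-script.sh
# Position 80 > 65, so this runs after Supervisor is up, on every boot
echo "Running custom boot hook..."
# Make sure a scratch directory exists in the workspace
mkdir -p "${WORKSPACE:-/workspace}/scratch"
In a derivative image, this file would be added with a COPY instruction in your Dockerfile.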
For quick customizations without building a new image:
#!/bin/bash
set -eo pipefail
# Activate the main virtual environment
. /venv/main/bin/activate
# Install packages
pip install torch transformers
# Download models
hf download meta-llama/Llama-2-7b-hf --local-dir /workspace/models
# Add a new application to Supervisor (see "Adding Custom Applications" for details)
cat > /etc/supervisor/conf.d/my-app.conf << 'EOF'
[program:my-app]
environment=PROC_NAME="%(program_name)s"
command=/opt/supervisor-scripts/my-app.sh
autostart=true
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true
EOF
# Reload Supervisor to pick up new config
supervisorctl reread && supervisorctl updateThe best way to create a custom image is to extend our pre-built images from DockerHub. This preserves the layer caching benefits—Vast.ai hosts already have our base layers cached, so only your custom layers need to be downloaded.
The contents of /opt/workspace-internal/ are copied to $WORKSPACE (default /workspace) on first boot. This happens because:
- /workspace may be a volume mount
- Even if not mounted, copying moves content to the uppermost OverlayFS layer, enabling effective use of Vast's copy tools
For large models: Don't place large files directly in /opt/workspace-internal/. Instead, store them elsewhere (e.g., /models/) and create symlinks:
# Store large model outside workspace-internal
RUN hf download stabilityai/stable-diffusion-xl-base-1.0 --local-dir /models/sdxl-base
# Create symlink in workspace-internal
RUN mkdir -p /opt/workspace-internal/ComfyUI/models/checkpoints && \
    ln -s /models/sdxl-base /opt/workspace-internal/ComfyUI/models/checkpoints/sdxl-base

This avoids duplicating large files when they're copied to the workspace volume.
# Extend the pre-built image to preserve layer caching benefits
FROM vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04
# Install Python packages into the main virtual environment
RUN . /venv/main/bin/activate && \
    pip install torch torchvision torchaudio transformers accelerate
# Download a large model to a location outside workspace-internal
RUN . /venv/main/bin/activate && \
    hf download stabilityai/stable-diffusion-xl-base-1.0 --local-dir /models/sdxl-base
# Create symlink so it appears in workspace
RUN mkdir -p /opt/workspace-internal/models && \
    ln -s /models/sdxl-base /opt/workspace-internal/models/sdxl-base
# Add a custom application managed by Supervisor
COPY my-app.conf /etc/supervisor/conf.d/
COPY my-app.sh /opt/supervisor-scripts/
RUN chmod +x /opt/supervisor-scripts/my-app.sh
# Configure Instance Portal to include your app
ENV PORTAL_CONFIG="localhost:1111:11111:/:Instance Portal|localhost:8080:18080:/:Jupyter|localhost:7860:17860:/:My App"

Supervisor manages all long-running applications. To add your own:
1. Create a Supervisor config (my-app.conf):
[program:my-app]
environment=PROC_NAME="%(program_name)s"
command=/opt/supervisor-scripts/my-app.sh
autostart=true
autorestart=true
# IMPORTANT: Log to stdout for Vast.ai logging integration
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true

Note: Always configure stdout_logfile=/dev/stdout and redirect_stderr=true. If you log directly to files, output won't appear in Vast.ai's logging system.
2. Create a wrapper script (my-app.sh):
#!/bin/bash
# Import logging utilities for Portal log viewer
utils=/opt/supervisor-scripts/utils
. "${utils}/logging.sh"
. "${utils}/environment.sh"
# Activate the virtual environment
source /venv/main/bin/activate
# Run your application (bind to localhost, Caddy handles external access)
exec python /opt/my-app/main.py --host localhost --port 17860

The logging utilities in /opt/supervisor-scripts/utils/ handle:
- logging.sh - Tees output to /var/log/portal/${PROC_NAME}.log for the Portal log viewer
- environment.sh - Sets up common environment variables
- exit_portal.sh - Checks if the app is configured in portal.yaml before starting
| Path | Purpose |
|---|---|
| /venv/main/ | Primary Python virtual environment (Conda-managed) |
| /workspace/ | Persistent workspace directory |
| /opt/workspace-internal/ | Contents copied to /workspace on first boot |
| /etc/supervisor/conf.d/ | Supervisor service configurations |
| /opt/supervisor-scripts/ | Service wrapper scripts |
| /opt/supervisor-scripts/utils/ | Shared utilities for logging, environment setup |
| /etc/portal.yaml | Instance Portal configuration (generated from PORTAL_CONFIG) |
| /var/log/portal/ | Application logs (viewable in Instance Portal) |
| /etc/instance.crt, /etc/instance.key | TLS certificates |
| Variable | Description |
|---|---|
| PORTAL_CONFIG | Application configuration (see PORTAL_CONFIG) |
| ENABLE_HTTPS | Enable HTTPS for proxied applications (default: false) |
| ENABLE_AUTH | Enable authentication (default: true) |
| AUTH_EXCLUDE | Comma-separated ports to exclude from authentication |
| WEB_USERNAME | Basic auth username (default: vastai) |
| WEB_PASSWORD | Basic auth password (default: OPEN_BUTTON_TOKEN) |
| CF_TUNNEL_TOKEN | Cloudflare tunnel token for custom domains |
| Variable | Description |
|---|---|
| BOOT_SCRIPT | URL to custom boot script (replaces default startup) |
| HOTFIX_SCRIPT | URL to early-run patch script |
| PROVISIONING_SCRIPT | URL to post-Supervisor setup script |
| SERVERLESS | Set to true for faster cold starts |
| Variable | Description |
|---|---|
| TENSORBOARD_LOG_DIR | Tensorboard log directory (default: /workspace) |
Note: Building from source creates new image layers that won't be cached on Vast.ai hosts. For most use cases, extending our pre-built images is faster and more efficient.
If you need to modify the base image itself (not just add layers on top), you can build from source:
git clone https://github.com/vast-ai/base-image
cd base-image
# Build a specific variant
docker buildx build \
  --build-arg BASE_IMAGE=nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04 \
  --build-arg PYTHON_VERSION=3.11 \
  -t my-base-image .
# Or use the build script for all variants
./build.sh --filter cuda-12.8 --dry-run # Preview what would be built
./build.sh --filter cuda-12.8 # Build CUDA 12.8 variants
./build.sh --list # Show all available configurations

See LICENSE.md for details.