Vast.ai Base Docker Image

A feature-rich base image designed for GPU computing on Vast.ai. This image extends large, commonly-used base images to maximize Docker layer caching benefits, resulting in faster instance startup times.

Why This Image?

Optimized for Fast Startup Through Layer Caching

Vast.ai host machines cache commonly-used Docker image layers. By building on top of large, popular base images like nvidia/cuda and rocm/dev-ubuntu, most of the image content is already present on host machines before you even start your instance.

How it works:

  • Base images (NVIDIA CUDA, AMD ROCm, Ubuntu) are multi-gigabyte images commonly used across the platform
  • These base layers are frequently cached on Vast.ai hosts
  • Our image adds development tools, security features, and convenience utilities as additional layers
  • When you start an instance, only the smaller top layers need to be downloaded
  • Result: Fast startup times despite having a comprehensive development environment

Automatic CUDA Version Selection

Vast.ai's backend automatically selects the appropriate image variant based on the host machine's maximum supported CUDA version (determined by the installed NVIDIA driver). When you rent a machine:

  1. The system detects the host's maximum CUDA capability from its NVIDIA driver (e.g., supports up to CUDA 12.9)
  2. It finds the most recently pushed Docker image tag containing a compatible CUDA version
  3. It pulls that specific image variant

Example: A machine with drivers supporting CUDA 12.8 will pull the cuda-12.8.1-* variant, while a newer machine supporting CUDA 12.9 will pull cuda-12.9.1-*. This ensures you always get the best compatible version without manual configuration.
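
If you want to confirm what a host's driver supports, the nvidia-smi header reports it from inside any running instance. This is only a sanity check; the variant selection itself happens on Vast.ai's backend:

# The "CUDA Version" field reflects the driver's capability,
# not the toolkit installed inside the container
nvidia-smi | grep "CUDA Version"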

Available Image Variants

We build multiple variants to support different hardware and Python requirements:

Base Images

| Type | Base Image | Ubuntu | Notes |
| --- | --- | --- | --- |
| Stock | ubuntu:22.04, ubuntu:24.04 | 22.04, 24.04 | No CUDA/ROCm libraries, but NVIDIA drivers are still loaded at runtime |
| NVIDIA CUDA | nvidia/cuda:*-cudnn-devel-ubuntu* | 22.04, 24.04 | Full CUDA toolkit + cuDNN (11.8, 12.1, 12.4, 12.6, 12.8, 12.9, 13.0.1, 13.0.2) |
| AMD ROCm | rocm/dev-ubuntu-*:6.2.4-complete | 22.04, 24.04 | Complete ROCm 6.2.4 development environment |

Note: Stock images can still access NVIDIA GPUs—they simply don't include the heavier CUDA development libraries. Use these when you want a lighter image and will install specific CUDA components yourself.

Python Versions

Each base image variant is available with Python 3.7 through 3.14. The default Python version matches the Ubuntu release:

  • Ubuntu 22.04: Python 3.10
  • Ubuntu 24.04: Python 3.12

Tag Format

Explicit tags for specific configurations:

vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04-py310
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^
                  Base image identifier              Python version

Tags without the Python suffix use the Ubuntu default (e.g., cuda-12.8.1-cudnn-devel-ubuntu22.04 uses Python 3.10).

Pre-built images are available on DockerHub.
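
For example, pulling an explicit variant locally:

docker pull vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04-py310

# Or the same base with the Ubuntu-default Python (3.10)
docker pull vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04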

Features

Development Environment

| Category | Tools Included |
| --- | --- |
| Python | Miniforge/Conda, uv package manager, pre-configured /venv/main environment |
| Build Tools | build-essential, cmake, ninja-build, gdb, libssl-dev |
| Version Control | git, git-lfs |
| Node.js | NVM with latest LTS version |
| Editors | vim, nano |
| Shell Utilities | curl, wget, jq, rsync, rclone, zip/unzip, zstd |

GPU & Compute Support

| Feature | Description |
| --- | --- |
| CUDA | Full development toolkit with cuDNN (NVIDIA CUDA variants) |
| ROCm | Complete ROCm development environment (AMD variants) |
| OpenCL | Headers, ICD loaders, and runtime for both NVIDIA and AMD |
| Vulkan | Runtime and tools |
| NVIDIA Extras | OpenGL, video encode/decode libraries (auto-selected for driver compatibility) |
| Infiniband | rdma-core, libibverbs, infiniband-diags for high-speed networking |

System Monitoring

| Tool | Purpose |
| --- | --- |
| htop | Interactive process viewer |
| nvtop | GPU process monitoring |
| iotop | I/O usage monitoring |

Pre-configured Applications

| Application | Description | Default Port |
| --- | --- | --- |
| Jupyter | Interactive Python notebooks | 8080 |
| Syncthing | Peer-to-peer file synchronization | 8384 |
| Tensorboard | ML experiment visualization | 6006 |
| Cron | Task scheduling | - |

All applications are managed by Supervisor with configs in /etc/supervisor/conf.d/.
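
You can inspect or restart these services from a shell with standard supervisorctl commands (the program name below is illustrative; use the names defined in /etc/supervisor/conf.d/):

# List all Supervisor-managed programs and their current state
supervisorctl status

# Restart one program by its [program:...] name (example name)
supervisorctl restart jupyter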

Additional Tools

| Tool | Purpose |
| --- | --- |
| Vast CLI | Manage Vast.ai instances from the command line |
| magic-wormhole | Secure file transfer between machines |
| Syncthing | Keep files synchronized across instances |
| rclone | Cloud storage management |

User Configuration

  • Non-root user: user account (UID 1001) with passwordless sudo
  • Shared permissions: umask 002 for collaborative file access
  • SSH key propagation: Keys automatically set up for both root and user accounts

Instance Portal

The Instance Portal is a web-based dashboard for managing applications running on your instance. It provides secure access through TLS, authentication, and Cloudflare tunnels.

Access the portal by clicking "Open" on your instance card in the Vast.ai console. See the Instance Portal documentation for complete details.

PORTAL_CONFIG

The PORTAL_CONFIG environment variable defines which applications appear in the Instance Portal. Format:

hostname:external_port:internal_port:path:name|hostname:external_port:internal_port:path:name|...

| Field | Description |
| --- | --- |
| hostname | Usually localhost |
| external_port | Port exposed via -p flag (must be open in template) |
| internal_port | Port where your application listens |
| path | URL path for access (usually /) |
| name | Display name in the portal |

Example:

PORTAL_CONFIG="localhost:1111:11111:/:Instance Portal|localhost:8080:18080:/:Jupyter|localhost:7860:17860:/:My App"

Port behavior:

  • When external_port ≠ internal_port: Caddy reverse proxy makes the app available on the external port with TLS and authentication
  • When external_port = internal_port: The application bypasses proxying (direct access), but tunnel links are still created

The configuration is written to /etc/portal.yaml on first boot. You can edit this file at runtime and restart Caddy with supervisorctl restart caddy.

Enabling HTTPS

To enable HTTPS for all proxied applications, set:

ENABLE_HTTPS=true

When enabled:

  • Caddy serves applications over HTTPS using certificates at /etc/instance.crt and /etc/instance.key
  • Self-signed certificates are generated automatically during boot
  • Install the Vast.ai Jupyter certificate locally to avoid browser warnings
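
A quick way to verify HTTPS from inside the instance (a sketch; port 1111 is the Instance Portal's external port from the PORTAL_CONFIG example above):

# Self-signed certificate, so skip verification with -k
# (an auth challenge in the response still confirms TLS is working)
curl -kI https://localhost:1111/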

Authentication

Authentication is enabled by default for all proxied ports. Access methods:

  1. Open Button: Click "Open" on your instance card—automatically sets an auth cookie
  2. Basic Auth: Username vastai, password is your OPEN_BUTTON_TOKEN
  3. Bearer Token: Include Authorization: Bearer ${OPEN_BUTTON_TOKEN} header for API access
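
For example, scripted access with the bearer token (port 8080 is the Jupyter external port from the PORTAL_CONFIG example above):

# Pass the instance's open-button token as a bearer token
curl -H "Authorization: Bearer ${OPEN_BUTTON_TOKEN}" http://localhost:8080/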

Related variables:

| Variable | Description |
| --- | --- |
| ENABLE_AUTH | Set to false to disable authentication (default: true) |
| AUTH_EXCLUDE | Comma-separated list of external ports to exclude from auth |
| WEB_USERNAME | Custom username for basic auth (default: vastai) |
| WEB_PASSWORD | Custom password (default: auto-generated or OPEN_BUTTON_TOKEN) |
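
A sketch of how these might be combined in a template's environment settings (port 7860 matches the "My App" entry from the PORTAL_CONFIG example; the credentials are illustrative):

ENABLE_AUTH=true
AUTH_EXCLUDE=7860          # serve port 7860 without authentication
WEB_USERNAME=myuser        # illustrative custom credentials
WEB_PASSWORD=change-me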

Cloudflare Tunnels

The Instance Portal automatically creates Cloudflare tunnels for your applications, providing URLs like:

https://four-random-words.trycloudflare.com

For persistent custom domains, set CF_TUNNEL_TOKEN to your Cloudflare tunnel token. Note: Each running instance requires a separate tunnel token.

Startup Configuration

Entrypoint Arguments

The default boot script (/opt/instance-tools/bin/boot_default.sh) accepts these arguments:

| Argument | Description |
| --- | --- |
| --no-user-keys | Skip SSH key propagation to the user account |
| --no-export-env | Don't export environment variables to /etc/environment |
| --no-cert-gen | Skip TLS certificate generation |
| --no-update-portal | Don't check for Instance Portal updates |
| --no-update-vast | Don't check for Vast CLI updates |
| --no-activate-pyenv | Don't activate Python environment in shell |
| --sync-environment | Sync Python/Conda environments to workspace volume for persistence |
| --sync-home | Sync home directories to workspace |
| --jupyter-override | Force Jupyter to start even in non-Jupyter launch modes |

Example (in Docker run command or template):

/opt/instance-tools/bin/entrypoint.sh --sync-environment --no-update-portal

Startup Environment Variables

| Variable | Description |
| --- | --- |
| BOOT_SCRIPT | URL to a custom boot script that replaces the entire default startup routine |
| HOTFIX_SCRIPT | URL to a script that runs very early in boot, before most initialization; use it to patch broken containers |
| PROVISIONING_SCRIPT | URL to a script that runs after Supervisor starts; use it to install packages and configure applications |
| SERVERLESS | Set to true to skip update checks for faster cold starts |

Execution order:

  1. BOOT_SCRIPT (if set, replaces everything below)
  2. HOTFIX_SCRIPT (runs first, can modify any part of startup)
  3. Normal boot sequence (environment setup, workspace sync, TLS certs, Supervisor)
  4. PROVISIONING_SCRIPT (runs after Supervisor, installs your customizations)
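
These are normally set as template environment variables; a sketch (the URL is a placeholder for a script you host yourself):

# Placeholder URL: point this at your own hosted script
PROVISIONING_SCRIPT=https://example.com/my-provisioning.sh
SERVERLESS=true        # skip update checks for faster cold starts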

Custom Boot Scripts

For derivative images, you can add custom scripts to /etc/vast_boot.d/ to hook into the boot sequence. Scripts are sourced in alphabetical order by filename, so use numeric prefixes to control ordering:

/etc/vast_boot.d/
├── 10-prep-env.sh
├── 25-first-boot.sh
├── 35-sync-home-dirs.sh
├── ...
├── 65-supervisor-launch.sh
├── 75-provisioning-script.sh
├── 80-my-custom-script.sh       # Runs every boot
└── first_boot/
    ├── 05-update-vast.sh        # Runs only on first boot
    └── 20-my-first-boot.sh      # Your first-boot script

  • Every boot: Add scripts directly to /etc/vast_boot.d/
  • First boot only: Add scripts to /etc/vast_boot.d/first_boot/

Key ordering points:

  • Scripts before 65-* run before Supervisor starts
  • Scripts at 75-* or later run after Supervisor is running
  • First-boot scripts (in first_boot/) are sourced at position 25-*
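
A minimal sketch of an every-boot hook, using the 80-my-custom-script.sh name from the tree above (the contents are illustrative). Because these files are sourced rather than executed, keep them idempotent and avoid set -e:

# /etc/vast_boot.d/80-my-custom-script.sh
# Sourced on every boot; position 80 means Supervisor is already running
echo "custom boot hook: preparing data directory"
mkdir -p "${WORKSPACE:-/workspace}/my-app-data"

In a derived image, COPY the file into /etc/vast_boot.d/ (or into /etc/vast_boot.d/first_boot/ for first-boot-only logic).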

Provisioning Script Example

For quick customizations without building a new image:

#!/bin/bash
set -eo pipefail

# Activate the main virtual environment
. /venv/main/bin/activate

# Install packages
pip install torch transformers

# Download models
hf download meta-llama/Llama-2-7b-hf --local-dir /workspace/models

# Add a new application to Supervisor (see "Adding Custom Applications" for details)
cat > /etc/supervisor/conf.d/my-app.conf << 'EOF'
[program:my-app]
environment=PROC_NAME="%(program_name)s"
command=/opt/supervisor-scripts/my-app.sh
autostart=true
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true
EOF

# Reload Supervisor to pick up new config
supervisorctl reread && supervisorctl update

Building a Derived Image (Recommended)

The best way to create a custom image is to extend our pre-built images from DockerHub. This preserves the layer caching benefits—Vast.ai hosts already have our base layers cached, so only your custom layers need to be downloaded.

Workspace Directory

The contents of /opt/workspace-internal/ are copied to $WORKSPACE (default /workspace) on first boot. This happens because:

  • /workspace may be a volume mount
  • Even if not mounted, copying moves content to the uppermost OverlayFS layer, enabling effective use of Vast's copy tools

For large models: Don't place large files directly in /opt/workspace-internal/. Instead, store them elsewhere (e.g., /models/) and create symlinks:

# Store large model outside workspace-internal
RUN hf download stabilityai/stable-diffusion-xl-base-1.0 --local-dir /models/sdxl-base

# Create symlink in workspace-internal
RUN mkdir -p /opt/workspace-internal/ComfyUI/models/checkpoints && \
    ln -s /models/sdxl-base /opt/workspace-internal/ComfyUI/models/checkpoints/sdxl-base

This avoids duplicating large files when they're copied to the workspace volume.

Example Dockerfile

# Extend the pre-built image to preserve layer caching benefits
FROM vastai/base-image:cuda-12.8.1-cudnn-devel-ubuntu22.04

# Install Python packages into the main virtual environment
RUN . /venv/main/bin/activate && \
    pip install torch torchvision torchaudio transformers accelerate

# Download a large model to a location outside workspace-internal
RUN . /venv/main/bin/activate && \
    hf download stabilityai/stable-diffusion-xl-base-1.0 --local-dir /models/sdxl-base

# Create symlink so it appears in workspace
RUN mkdir -p /opt/workspace-internal/models && \
    ln -s /models/sdxl-base /opt/workspace-internal/models/sdxl-base

# Add a custom application managed by Supervisor
COPY my-app.conf /etc/supervisor/conf.d/
COPY my-app.sh /opt/supervisor-scripts/
RUN chmod +x /opt/supervisor-scripts/my-app.sh

# Configure Instance Portal to include your app
ENV PORTAL_CONFIG="localhost:1111:11111:/:Instance Portal|localhost:8080:18080:/:Jupyter|localhost:7860:17860:/:My App"

Adding Custom Applications

Supervisor manages all long-running applications. To add your own:

1. Create a Supervisor config (my-app.conf):

[program:my-app]
environment=PROC_NAME="%(program_name)s"
command=/opt/supervisor-scripts/my-app.sh
autostart=true
autorestart=true
# IMPORTANT: Log to stdout for Vast.ai logging integration
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
redirect_stderr=true

Note: Always configure stdout_logfile=/dev/stdout and redirect_stderr=true. If you log directly to files, output won't appear in Vast.ai's logging system.

2. Create a wrapper script (my-app.sh):

#!/bin/bash

# Import logging utilities for Portal log viewer
utils=/opt/supervisor-scripts/utils
. "${utils}/logging.sh"
. "${utils}/environment.sh"

# Activate the virtual environment
source /venv/main/bin/activate

# Run your application (bind to localhost, Caddy handles external access)
exec python /opt/my-app/main.py --host localhost --port 17860

The logging utilities in /opt/supervisor-scripts/utils/ handle:

  • logging.sh - Tees output to /var/log/portal/${PROC_NAME}.log for the Portal log viewer
  • environment.sh - Sets up common environment variables
  • exit_portal.sh - Checks if the app is configured in portal.yaml before starting
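
After adding a config and wrapper (at build time or from a provisioning script), a quick way to verify the new service, assuming the my-app name used above:

# Load the new config and check the program state
supervisorctl reread && supervisorctl update
supervisorctl status my-app

# Follow the Portal log produced via logging.sh
tail -f /var/log/portal/my-app.log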

Key Paths

| Path | Purpose |
| --- | --- |
| /venv/main/ | Primary Python virtual environment (Conda-managed) |
| /workspace/ | Persistent workspace directory |
| /opt/workspace-internal/ | Contents copied to /workspace on first boot |
| /etc/supervisor/conf.d/ | Supervisor service configurations |
| /opt/supervisor-scripts/ | Service wrapper scripts |
| /opt/supervisor-scripts/utils/ | Shared utilities for logging, environment setup |
| /etc/portal.yaml | Instance Portal configuration (generated from PORTAL_CONFIG) |
| /var/log/portal/ | Application logs (viewable in Instance Portal) |
| /etc/instance.crt, /etc/instance.key | TLS certificates |

Environment Variables Reference

Instance Portal

| Variable | Description |
| --- | --- |
| PORTAL_CONFIG | Application configuration (see PORTAL_CONFIG) |
| ENABLE_HTTPS | Enable HTTPS for proxied applications (default: false) |
| ENABLE_AUTH | Enable authentication (default: true) |
| AUTH_EXCLUDE | Comma-separated ports to exclude from authentication |
| WEB_USERNAME | Basic auth username (default: vastai) |
| WEB_PASSWORD | Basic auth password (default: OPEN_BUTTON_TOKEN) |
| CF_TUNNEL_TOKEN | Cloudflare tunnel token for custom domains |

Startup

| Variable | Description |
| --- | --- |
| BOOT_SCRIPT | URL to custom boot script (replaces default startup) |
| HOTFIX_SCRIPT | URL to early-run patch script |
| PROVISIONING_SCRIPT | URL to post-Supervisor setup script |
| SERVERLESS | Set to true for faster cold starts |

Applications

| Variable | Description |
| --- | --- |
| TENSORBOARD_LOG_DIR | Tensorboard log directory (default: /workspace) |
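
For example, to point Tensorboard at a dedicated run directory, set the variable in your template environment (the path is illustrative):

TENSORBOARD_LOG_DIR=/workspace/tensorboard-logs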

Building From Source (Not Recommended)

Note: Building from source creates new image layers that won't be cached on Vast.ai hosts. For most use cases, extending our pre-built images is faster and more efficient.

If you need to modify the base image itself (not just add layers on top), you can build from source:

git clone https://github.com/vast-ai/base-image
cd base-image

# Build a specific variant
docker buildx build \
    --build-arg BASE_IMAGE=nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04 \
    --build-arg PYTHON_VERSION=3.11 \
    -t my-base-image .

# Or use the build script for all variants
./build.sh --filter cuda-12.8 --dry-run  # Preview what would be built
./build.sh --filter cuda-12.8            # Build CUDA 12.8 variants
./build.sh --list                        # Show all available configurations

License

See LICENSE.md for details.
