WIP test out new dockerfile with more nvidia tools #1557

Open
wants to merge 14 commits into
base: main
36 changes: 36 additions & 0 deletions .github/workflows/beta.yml
@@ -0,0 +1,36 @@
name: beta-docker-images

on:
  workflow_dispatch:
  pull_request:

jobs:
  build-axolotl-beta:
    if: ${{ ! contains(github.event.commits[0].message, '[skip docker]') && github.repository_owner == 'OpenAccess-AI-Collective' }}
    strategy:
      fail-fast: false
    runs-on: axolotl-gpu-runner
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Docker metadata
        id: metadata
        uses: docker/metadata-action@v5
        with:
          images: winglian/axolotl-beta
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      # guidance for testing before pushing: https://docs.docker.com/build/ci/github-actions/test-before-push/
      - name: Build and export to Docker
        uses: docker/build-push-action@v5
        with:
          context: .
          file: ./docker/Dockerfile-beta
          tags: |
            ${{ steps.metadata.outputs.tags }}
          labels: ${{ steps.metadata.outputs.labels }}
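Reviewer note: the job-level `if:` guard in beta.yml can be read as the predicate below. This is a minimal Python sketch and not part of the PR; `should_build` and its parameters are hypothetical names. It assumes the documented GitHub Actions expression semantics, where `contains` and `==` ignore case for strings, and it assumes that a missing value (for example, an event payload with no `commits` array, which may be the case for `pull_request` events and is worth verifying) is coerced to an empty string. The skip marker is passed in as an argument, so the sketch holds for whatever literal the workflow uses.

```python
from typing import Optional


def should_build(commit_message: Optional[str], repo_owner: str, skip_marker: str) -> bool:
    """Hypothetical model of the job-level `if:` guard in beta.yml."""
    # Assumption: a missing commit message behaves like an empty string.
    message = (commit_message or "").lower()
    # contains(...) on strings is a case-insensitive substring test.
    has_marker = skip_marker.lower() in message
    # String equality in workflow expressions also ignores case.
    owner_ok = repo_owner.lower() == "OpenAccess-AI-Collective".lower()
    return (not has_marker) and owner_ok


if __name__ == "__main__":
    marker = "[skip docker]"  # illustrative marker value
    print(should_build("fix typo [skip docker]", "OpenAccess-AI-Collective", marker))
    print(should_build("add beta image", "OpenAccess-AI-Collective", marker))
    print(should_build(None, "OpenAccess-AI-Collective", marker))
    print(should_build("add beta image", "some-fork-owner", marker))
```

Under these assumptions, an event without a commit message reduces the guard to the owner check alone, so the skip marker would have no effect for that event.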
6 changes: 6 additions & 0 deletions docker/Dockerfile-base
@@ -35,3 +35,9 @@ RUN git lfs install --skip-repo && \
pip3 install awscli && \
# The base image ships with `pydantic==1.8.2` which is not working
pip3 install -U --no-cache-dir pydantic==1.10.10

WORKDIR /workspace

RUN git clone --depth=1 https://github.com/OpenAccess-AI-Collective/axolotl.git

WORKDIR /workspace/axolotl
54 changes: 54 additions & 0 deletions docker/Dockerfile-beta
@@ -0,0 +1,54 @@
FROM nvcr.io/nvidia/pytorch:24.03-py3

RUN apt update && apt install -y python3.10-venv git-lfs

RUN python3 -m pip install --upgrade pip && \
pip install packaging && \
pip uninstall -y torch-tensorrt && \
pip install -U torch==2.2.2 && \
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable

RUN groupadd axolotl && \
useradd -m -g axolotl -s /bin/bash axolotl && \
chown axolotl:axolotl /workspace

USER axolotl

RUN mkdir -p /home/axolotl/venv

RUN python -m venv --system-site-packages /home/axolotl/venv/axolotl

ENV PATH="/home/axolotl/venv/axolotl/bin:$PATH"

RUN echo "source /home/axolotl/venv/axolotl/bin/activate" >> /home/axolotl/.bashrc

RUN git lfs install --skip-repo

WORKDIR /workspace

RUN git clone --depth=1 https://github.com/OpenAccess-AI-Collective/axolotl.git

WORKDIR /workspace/axolotl

RUN pip install causal_conv1d && \
cd /workspace/axolotl && \
pip install -e .[deepspeed,flash-attn,mamba-ssm,galore]

# So we can test the Docker image
RUN pip install pytest

# fix so that git fetch/pull from remote works
RUN git config remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*" && \
git config --get remote.origin.fetch

# helper for huggingface-login cli
RUN git config --global credential.helper store


ENV HF_DATASETS_CACHE="/workspace/data/huggingface-cache/datasets"
ENV HUGGINGFACE_HUB_CACHE="/workspace/data/huggingface-cache/hub"
ENV TRANSFORMERS_CACHE="/workspace/data/huggingface-cache/hub"
ENV HF_HOME="/workspace/data/huggingface-cache/hub"
ENV HF_HUB_ENABLE_HF_TRANSFER="1"

CMD ["sleep", "infinity"]
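Reviewer note: Dockerfile-beta installs pytest "so we can test the Docker image" and sets four Hugging Face cache variables. One possible smoke check, sketched below, confirms that each of those variables points somewhere under the shared cache directory. This is not part of the PR; `misplaced_cache_vars`, `CACHE_VARS`, and `CACHE_ROOT` are hypothetical names, and the expectation that every path sits under `/workspace/data/huggingface-cache` is an assumption read off the ENV lines.

```python
import os
from pathlib import PurePosixPath
from typing import List, Mapping

# Variables set by the ENV lines in docker/Dockerfile-beta.
CACHE_VARS = (
    "HF_DATASETS_CACHE",
    "HUGGINGFACE_HUB_CACHE",
    "TRANSFORMERS_CACHE",
    "HF_HOME",
)
# Hypothetical expectation: every cache path lives under this directory.
CACHE_ROOT = PurePosixPath("/workspace/data/huggingface-cache")


def misplaced_cache_vars(env: Mapping[str, str]) -> List[str]:
    """Return names of cache variables that are unset or outside CACHE_ROOT."""
    bad = []
    for name in CACHE_VARS:
        value = env.get(name)
        if value is None or CACHE_ROOT not in PurePosixPath(value).parents:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Inside the container this should print an empty list.
    print(misplaced_cache_vars(os.environ))
```

Inside a pytest file the same function could back a single assertion such as `assert misplaced_cache_vars(os.environ) == []`.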
2 changes: 1 addition & 1 deletion setup.py
@@ -68,7 +68,7 @@ def parse_requirements():
     dependency_links=dependency_links,
     extras_require={
         "flash-attn": [
-            "flash-attn==2.5.5",
+            "flash-attn>=2.4.2",
         ],
         "fused-dense-lib": [
             "fused-dense-lib @ git+https://github.com/Dao-AILab/[email protected]#subdirectory=csrc/fused_dense_lib",
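Reviewer note: the setup.py hunk lists both `flash-attn==2.5.5` and `flash-attn>=2.4.2`; with one addition and one deletion, the exact pin appears to be replaced by a lower bound. The sketch below illustrates how the two specifiers differ in what they accept. It is a deliberately simplified, stdlib-only model that handles only plain dotted releases and the `==` and `>=` operators; it is not how pip resolves versions (the `packaging` library's `SpecifierSet` implements the full rules). `parse_release` and `satisfies` are hypothetical names, and the candidate versions are illustrative strings, not a claim about which releases exist.

```python
from typing import Tuple


def parse_release(text: str) -> Tuple[int, ...]:
    """Parse a plain dotted release such as '2.5.5' (no pre/post/dev tags)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, specifier: str) -> bool:
    """Simplified check for the two operators in this diff: '==' and '>='."""
    for op in ("==", ">="):
        if specifier.startswith(op):
            have = parse_release(version)
            wanted = parse_release(specifier[len(op):])
            # Pad with zeros so that '2.5' and '2.5.0' compare as equal.
            width = max(len(have), len(wanted))
            have += (0,) * (width - len(have))
            wanted += (0,) * (width - len(wanted))
            return have == wanted if op == "==" else have >= wanted
    raise ValueError(f"unsupported specifier: {specifier}")


if __name__ == "__main__":
    for candidate in ("2.3.6", "2.4.2", "2.5.5", "2.5.7"):
        print(candidate, satisfies(candidate, "==2.5.5"), satisfies(candidate, ">=2.4.2"))
```

The practical consequence is that a lower bound lets the installer pick up any newer release, including one that happens to be preinstalled in the base image, whereas the exact pin forces a single version.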