
[modelcars] add role to build image #594


Merged: 3 commits, Dec 5, 2024
1 change: 1 addition & 0 deletions docs/toolbox.generated/index.rst
@@ -326,4 +326,5 @@ Toolbox Documentation

* :doc:`deploy_aws_efs <Storage.deploy_aws_efs>` Deploy AWS EFS CSI driver and configure AWS accordingly.
* :doc:`deploy_nfs_provisioner <Storage.deploy_nfs_provisioner>` Deploy NFS Provisioner
* :doc:`download_to_image <Storage.download_to_image>` Downloads a dataset into an image in the internal registry
* :doc:`download_to_pvc <Storage.download_to_pvc>` Downloads a dataset into a PVC of the cluster
29 changes: 29 additions & 0 deletions projects/storage/toolbox/storage.py
@@ -78,3 +78,32 @@ def download_to_pvc(
    """

    return RunAnsibleRole(locals())

@AnsibleRole("storage_download_to_image")
@AnsibleMappedParams
def download_to_image(
        self,
        source,
        image_name,
        namespace,
        image_tag="latest",
        org_name="modelcars",
        creds="",
        storage_dir="/",
        base_image="registry.access.redhat.com/ubi9/ubi",
):
    """
    Downloads a dataset into an image in the internal registry

    Args:
      source: URL of the source data
      image_name: Name of the imagestream that will be created or used to store the dataset files.
      namespace: Name of the namespace in which the imagestream will be created
      image_tag: Tag to push the image with
      org_name: The image will be pushed to <org_name>/<image_name>:<image_tag>
      creds: Path to credentials to use for accessing the dataset.
      storage_dir: Path to the data in the final image
      base_image: Base image for the image containing the data
    """

    return RunAnsibleRole(locals())
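
For reference, a hypothetical invocation of the new command through the repository's run_toolbox.py entry point (the command path, model URL, and namespace below are placeholder assumptions, not part of this PR):

    ./run_toolbox.py storage download_to_image \
        https://huggingface.co/<org>/<model> \
        my-dataset \
        my-namespace \
        --creds /path/to/hf-token \
        --storage_dir /models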
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# Auto-generated file, do not edit manually ...
# Toolbox generate command: repo generate_ansible_default_settings
# Source component: Storage.download_to_image

# Parameters
# URL of the source data
# Mandatory value
storage_download_to_image_source:

# Name of the imagestream that will be created or used to store the dataset files.
# Mandatory value
storage_download_to_image_image_name:

# Name of the namespace in which the imagestream will be created
# Mandatory value
storage_download_to_image_namespace:

# Tag to push the image with
storage_download_to_image_image_tag: latest

# The image will be pushed to <org_name>/<image_name>:<image_tag>
storage_download_to_image_org_name: modelcars

# Path to credentials to use for accessing the dataset.
storage_download_to_image_creds:

# Path to the data in the final image
storage_download_to_image_storage_dir: /

# Base image for the image containing the data
storage_download_to_image_base_image: registry.access.redhat.com/ubi9/ubi
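
These are only the generated defaults; a minimal sketch of overriding them when driving the role directly with Ansible (the playbook name and all values are placeholders, assuming a standard ansible-playbook run):

    ansible-playbook run_role.yml \
        -e storage_download_to_image_source=https://huggingface.co/<org>/<model> \
        -e storage_download_to_image_image_name=my-dataset \
        -e storage_download_to_image_namespace=my-namespace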
171 changes: 171 additions & 0 deletions projects/storage/toolbox/storage_download_to_image/files/entrypoint.sh
@@ -0,0 +1,171 @@
#! /bin/bash

set -e
set -u
set -o pipefail
set -x

echo "Assembling registry auth file"
LOCAL_AUTH_FILE=/var/run/secrets/openshift.io/push/.dockercfg

REGISTRY_AUTH=/tmp/.dockercfg_local
(echo "{ \"auths\": "; cat "$LOCAL_AUTH_FILE"; echo "}") > $REGISTRY_AUTH
stat $REGISTRY_AUTH

echo "---"

if [[ -z "${STORAGE_DIR:-}" ]]; then
STORAGE_DIR=/storage
fi

mkdir -p "/storage"
chmod ugo+w "/storage" || true

echo "---"

df -h "/storage/"

echo "---"

if [[ "$DOWNLOAD_SOURCE" == "https://huggingface.co/"* ]];
then
dnf install --quiet -y git-lfs

if [[ "${CRED_FILE:-}" ]];
then
echo "Enabling git 'store' credential helper ..."
sha256sum "${CRED_FILE}"

git config --global credential.helper "store --file=$CRED_FILE"
else
echo "No credential file passed."
fi

if ! time git clone "$DOWNLOAD_SOURCE" "/storage/${SOURCE_NAME}" --depth=1 \
|& grep -v 'unable to get credential storage lock in 1000 ms: Read-only file system'
then
rm -rf "/storage/${SOURCE_NAME}"
echo "Clone failed :/"
exit 1
fi
rm -rf "/storage/${SOURCE_NAME}/.git"

elif [[ "$DOWNLOAD_SOURCE" == "s3://"* ]];
then
if [[ -z "${CRED_FILE:-}" ]];
then
echo "ERROR: no credentials provided :/"
exit 1
fi
if [[ ! -f "${CRED_FILE}" ]]; then
echo "ERROR: credentials file does not exist :/"
exit 1
fi

dnf install --quiet -y unzip

echo "Building AWS cli ..."
curl -Ssf "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -q awscliv2.zip
./aws/install

export AWS_SHARED_CREDENTIALS_FILE=$CRED_FILE

if ! time aws s3 cp "$DOWNLOAD_SOURCE" "/storage/${SOURCE_NAME}" --recursive --quiet
then
rm -rf "/storage/${SOURCE_NAME}"
echo "Copy failed :/"
exit 1
fi
elif [[ "$DOWNLOAD_SOURCE" == "dmf://"* ]];
then
# CRED_FILE --> github token
# CRED_FILE_2 --> DMF JSON token

if [[ -z "${CRED_FILE:-}" ]];
then
echo "ERROR: no credentials provided :/"
exit 1
fi
if [[ ! -f "${CRED_FILE}" ]]; then
echo "ERROR: credentials file does not exist :/"
exit 1
fi

DMF_VERSION=1.7.1
DMF_WHL_NAME=dmf_lib-${DMF_VERSION}-py3-none-any.whl
DMF_WHL_FILE="/storage/../binaries/$DMF_WHL_NAME"
if [[ ! -f "$DMF_WHL_FILE" ]]; then
echo "ERROR: DMF wheel file does not exist ($DMF_WHL_FILE) :/"
exit 1
fi

export PATH=$HOME/.local/bin:$PATH
pip install --quiet "$DMF_WHL_FILE"

cat > lh-conf.yaml <<EOF
lakehouse:
environment: PROD
# please generate your token from https://watsonx-data.cash.sl.cloud9.ibm.com/token
token: $(cat ${CRED_FILE})
EOF

namespace=$(echo "$DOWNLOAD_SOURCE" | cut -d/ -f3)
model_label=$(echo "$DOWNLOAD_SOURCE" | cut -d/ -f4)

dmf model ls -n "$namespace" "$model_label" -t model_shared

time dmf model pull -n "$namespace" "$model_label" -t model_shared --dir $(realpath "$STORAGE_DIR/${SOURCE_NAME}.tmp")
# this ^^^ stores the model in $dir/$model_label.$revision ...

echo "Moving the model to its final storage location ..."
mv "/storage/${SOURCE_NAME}.tmp"/* "/storage/${SOURCE_NAME}/"
rmdir "/storage/${SOURCE_NAME}.tmp"

else
    cd "/storage/"

    echo "Downloading $DOWNLOAD_SOURCE ..."

    if ! time curl -O \
            --silent --fail --show-error \
            "${DOWNLOAD_SOURCE}";
    then
        echo "FATAL: failed to download from ${DOWNLOAD_SOURCE} ..."
        exit 1
    fi
fi

echo "All done!"

cd "/storage/"

time find "./${SOURCE_NAME}" ! -path '*/.git/*' -type f -exec sha256sum {} \; | tee -a "${SOURCE_NAME}.sha256sum"

echo "---"

du -sh "./${SOURCE_NAME}"

echo "---"

df -h "/storage/"

echo "---"

echo "Building the image"

cat > /tmp/Containerfile <<EOF
FROM ${BASE_IMAGE}
RUN ls /storage/
RUN mkdir -p ${STORAGE_DIR}
RUN cp -r /storage/* ${STORAGE_DIR}/.
RUN ls ${STORAGE_DIR}
EOF

# the vfs storage driver and chroot isolation let podman run inside an
# unprivileged pod, at the cost of slower image assembly
export STORAGE_DRIVER=vfs
podman build --isolation=chroot -f /tmp/Containerfile -v /storage:/storage:Z -t ${SOURCE_NAME}:${IMAGE_TAG} .
#rm -rf /var/lib/containers/storage
podman images
podman push --tls-verify=false --authfile=$REGISTRY_AUTH localhost/${SOURCE_NAME}:${IMAGE_TAG} $REMOTE_IMAGE

exit 0
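
After the push, the resulting tag can be sanity-checked; a sketch assuming the default OpenShift internal-registry service name, with placeholder org and image names:

    oc get imagestreamtag -n modelcars
    skopeo inspect --tls-verify=false \
        docker://image-registry.openshift-image-registry.svc:5000/modelcars/my-dataset:latest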
@@ -0,0 +1,3 @@
---
dependencies:
- role: check_deps