Amazon Sagemaker fails to extract the archived model.tar.gz #5

Open. Wants to merge 2 commits into main.

aroopgochhayat

Amazon SageMaker fails to extract the archived model.tar.gz for the following reason, quoted from the AWS Knowledge Center:

Failed to extract model data archive for container
SageMaker expects a TAR file with the model data for use in your endpoint. After SageMaker downloads the TAR file, the data archive is extracted. This error might occur if SageMaker can't extract this data archive. For example, SageMaker can't extract the data archive if the model artifact contains symbolic links for files located in the TAR file.

When you create an endpoint, make sure that the model artifacts don't include symbolic links within the TAR file. To check if the TAR file includes symbolic links, extract the model data, and then run the following command inside the artifacts:

find . -type l -ls
This command returns all the symbolic links found after searching through the current directory and any of its subdirectories. Replace any link that's returned with the actual copies of the file.
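
The same check can also be run against the archive itself, without extracting it first. The following is a minimal sketch using Python's standard tarfile module; the function name find_symlinks_in_archive is hypothetical and not part of this PR or of any AWS tooling.

```python
import tarfile


def find_symlinks_in_archive(archive_path):
    """Return (member name, link target) pairs for symbolic links in a tar archive."""
    symlinks = []
    # "r:*" lets tarfile detect the compression (gzip for model.tar.gz).
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            # issym() is true only for symbolic link members.
            if member.issym():
                symlinks.append((member.name, member.linkname))
    return symlinks
```

An empty list means the archive contains no symbolic links; any returned entry names a member that should be replaced with a real copy of the file before uploading.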

Ref: https://repost.aws/knowledge-center/sagemaker-endpoint-creation-fail

Added Changes:

  • Changed cache_dir to local_dir so that the actual files are downloaded. With cache_dir, the files are downloaded as blobs and then symlinked to those blobs, which produces a tar.gz that AWS SageMaker cannot extract.
  • Set local_dir_use_symlinks=False so that none of the files are symbolic links, which could otherwise cause model extraction to fail.
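
As a complementary safeguard at packaging time (not part of this PR's changes), the archive can be built so that any remaining symbolic links are stored as regular files. This is a minimal sketch using the standard tarfile module; the function name make_model_archive, the default archive name, and the choice to place files at the archive root are assumptions for illustration.

```python
import os
import tarfile


def make_model_archive(model_dir, archive_path="model.tar.gz"):
    """Pack model_dir into a gzip-compressed tar, storing link targets as regular files."""
    # dereference=True makes tarfile follow symbolic links and archive the
    # content they point to instead of the link itself.
    with tarfile.open(archive_path, "w:gz", dereference=True) as archive:
        for entry in sorted(os.listdir(model_dir)):
            # arcname places each entry at the top level of the archive
            # (an assumption about the desired layout).
            archive.add(os.path.join(model_dir, entry), arcname=entry)
    return archive_path
```

The archive path should be outside model_dir so the archive does not try to include itself. A dangling link raises an error here rather than being archived silently, since its target cannot be read.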
