Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Artifact Repository #2 #1662

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

BryceGattis
Copy link
Contributor

@BryceGattis BryceGattis commented Feb 18, 2024

This PR picks up from the work that work that @ttrently started in PR #1453.

Requirements

This implementation requires some additional packages that have not been included in rez.vendor.

  • awscli
  • pymongo==4.6.2

These can be installed via python -m pip install PKG --target path\to\rez\Lib\site-packages.

Setup

For testing I setup Mongo and Mongo Express docker containers locally by running the following commands:

  • docker run --detach --name mongo_rez --network rez -p 27017:27017 mongo:7.0.5
  • docker run --detach --name mongo_rez_express --network rez -e ME_CONFIG_MONGODB_SERVER=mongo_rez -p 8081:8081 mongo-express:1.0.2-20

I ran the Mongo Express container so I could easily see the contents of the DB in the browser.

How To Use

In order to use you will need an existing S3 bucket and a MongoDB instance with a named database and a collection called "packages".

In your rezconfig.py you will need to set the following:

packages_path = [
    "mongodb@localhost:27017/db_name",
]

artifacts_path = [
    "s3@s3://bucket-1/packages",
]

default_cachable = True
package_cache_same_device = True
package_cache_async  = False

Please note that the above Mongo URI will not directly be usable by PyMongo in a Python Console.

Steps

  1. Run rez build --install -p mongo@localhost:27017/db_name --clean to build a package and put its definition in the MongoDB. TODO: Running this seems to upload to DB now but install_variant doesn't fully work yet.

Changes

  • Made package caching optionally asynchronous. TODO: Not sure why this choice was made yet.

FUTURE

The following changes are not handled by this PR, but should be kept in mind for the future.

  • Introduce new terminology for variants:
    • Download: Pull a variant payload from the Artifact Repository to disk somehow? TODO: Clarify this.
    • Install: Unpacks the payload if necessary (if it's an archive file format), and caches it using payload caching? TODO: Investigate this.
  • For rez-release do we need to make release_packages_path a list so we can deploy to potentially multiple package repositories at once?

TODO:

  • rez-env can't find packages in the mongo package repository yet.
  • No Credential Provider class as described in https://gist.github.com/JeanChristopheMorinPerso/9a0705d1498378178b53061d90f3b2d4
    need some help with modern practices for credentials management and what our workflow should be here.
  • Check if this packs the payload before uploading to the Artifact Repository (convert to .zip)?
  • Figure out how to handle the two new PyPi package requirements:
    • awscli
    • pymongo
  • Hopefully we don't have to add these to the rez/vendor folder, but maybe that's a good first thing to do?
  • Define rez release behavior. Does this push to both to package repository and artifact repository? What does this do if there are multiple of either of those?
  • Need to clarify where exactly ArtifactRepositories belong in the architecture of Rez. Do we need a ArtifactRepository for packages on a shared filesystem or do we only use ArtifactRepositories when they are off premise?
  • Add new tests.

ttrently and others added 2 commits February 18, 2024 13:14
Adds the artifact repository concept with an example MongoDB / S3 setup.

Signed-off-by: ttrently <[email protected]>
@BryceGattis BryceGattis requested a review from a team as a code owner February 18, 2024 19:42
@@ -166,12 +165,6 @@ def add_variant(self, variant, force=False):
% variant.uri
)

if not os.path.isdir(variant_root):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the culprit for why the tests are breaking. We need to have a generic way to check if we can access the variant_root regardless of what type of PackageRepository is used (ex. database/filesystem).

Copy link
Contributor Author

@BryceGattis BryceGattis Feb 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed the immediate issue by readding this check and checking if the repository associated with a package is "filesystem", but it's not fully fleshed out yet.

@@ -303,6 +304,11 @@ def _uri(self):
"""
raise NotImplementedError

def install(self, location):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this even be here? Shouldn't it be the responsibility of the ArtifactRepository to know how to install a resource? Wouldn't this break if we're using s3 as an artifact repository since it can't just run a basic copytree from s3?

Copy link

codecov bot commented Mar 3, 2024

Codecov Report

Attention: Patch coverage is 62.96296% with 20 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (main@9c8de33). Click here to learn what that means.

Files Patch % Lines
src/rez/artifact_repository.py 52.94% 16 Missing ⚠️
src/rez/package_cache.py 71.42% 1 Missing and 1 partial ⚠️
src/rez/rezconfig.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1662   +/-   ##
=======================================
  Coverage        ?   58.02%           
=======================================
  Files           ?      129           
  Lines           ?    17133           
  Branches        ?     3505           
=======================================
  Hits            ?     9942           
  Misses          ?     6523           
  Partials        ?      668           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@BryceGattis
Copy link
Contributor Author

BryceGattis commented Mar 3, 2024

Thought for potential backlog item: Can we use a different data validation library or perhaps update our vendored schema package? The output you get out of it currently is borderline unreadable to a human. For example here's a snippet of a traceback I got while working on this PR:

  File "D:\Code\rez\venv\lib\site-packages\rez\vendor\schema\schema.py", line 235, in validate
    raise SchemaError([None] + x.autos, [e] + x.errors)
rez.vendor.schema.schema.SchemaError: Or(<class 'rez.utils.sourcecode.SourceCode'>, Schema({Optional(<class 'str'>): Or(Or(<class 'str'>, [<class 'str'>]), {Optional(<class 'str'>): <class 'object'>, 'command': Or(<class 'str'>, [<class 'str'>]), Optional('requires'): [Or(<class 'str'>, And(<class 'rez.util
s.formatting.PackageRequest'>, Use(<class 'str'>)))], Optional('run_on'): Or(<class 'str'>, [<class 'str'>]), Optional('on_variants'): Or(<class 'bool'>, {'type': 'requires', 'value': [Or(<class 'str'>, And(<class 'rez.utils.formatting.PackageRequest'>, Use(<class 'str'>)))]})})})) did not validate 'None'  
'None' should be instance of {Optional(<class 'str'>): Or(Or(<class 'str'>, [<class 'str'>]), {Optional(<class 'str'>): <class 'object'>, 'command': Or(<class 'str'>, [<class 'str'>]), Optional('requires'): [Or(<class 'str'>, And(<class 'rez.utils.formatting.PackageRequest'>, Use(<class 'str'>)))], Optional
('run_on'): Or(<class 'str'>, [<class 'str'>]), Optional('on_variants'): Or(<class 'bool'>, {'type': 'requires', 'value': [Or(<class 'str'>, And(<class 'rez.utils.formatting.PackageRequest'>, Use(<class 'str'>)))]})})}

@JeanChristopheMorinPerso
Copy link
Member

JeanChristopheMorinPerso commented Mar 3, 2024

@BryceGattis I agree the errors are awful. I have started to look at replacing our schema stuff with something more modern. But it's a difficult task because the schemas leak into our public API is in some places, the rez internals (the core ones) are all very tightly coupled to schemas and it is also very performance sensitive. It is probably positive to swap our schema library with something else without introducing breaking changes, but I'll be a tough task.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants