Request for comment: Scalable architecture #309
Comments
Thank you for the detailed proposal! I'm going to bring this up at our community meeting this Friday to get more opinions; you're welcome to join for that (10am EST at https://meet.google.com/jhk-cvuf-icd). My first impressions:
Thank you for the invitation. I plan to attend.
As noted in the Docker doc:
Composer repositories are also "not cohesive software distributions", and so ought not to be susceptible to mix-and-match attacks. In fact, one of Composer's principal functions is resolving dependency graphs to dynamically determine a project-specific set of compatible package releases.
Thank you for saying so. I wonder if such a quasi-recursive use of TUF could allow for a more broadly distributed server-side architecture, in addition to overcoming scalability/performance bottlenecks.
This was an interesting read. I'm not familiar with the internals of OCI registries, so some of the specifics are lost on me. However, it is clear that we share a bottleneck with snapshot metadata. That said, Rugged is a (relatively) straightforward implementation of the TUF spec. My understanding of the principles underpinning TUF is nowhere near deep enough to consider a custom POUF. Also, while Rugged was designed to support the Drupal Association's use case (and, by extension, Packagist's), it is intended to be agnostic with respect to the content it is signing. As such, we aren't planning any integration with specific repo/package formats.
If a TAP turns out to be the best path forward, I'll be happy to help. That said, I'm not sure what's involved, so I'd need some guidance.
Cc @kairoaraujo
Problem/Motivation
TUF metadata scalability is becoming an issue for us, both on the client side and for high-volume repositories.
Background
I'm the lead maintainer of the Rugged TUF Server. My work has primarily been sponsored by the Drupal Association, in an effort to implement TUF on their Composer repository of ~15,000 packages (~150,000 releases, 165K targets). TUF metadata for these packages currently weighs in at ~29 MB.
We are using hashed bins, with 2048 bins at the moment, but we're experimenting with performance at different sizes. We have not (yet) implemented succinct hashed bins, as that only reduces the size of `bins.json`, which never changes in normal operations, and so only represents a relatively small metadata overhead.

We're aware of TAP 16 (https://github.com/theupdateframework/taps/blob/master/tap16.md), which proposes snapshot Merkle trees. This looks like it should address the problem of `snapshot.json` growing in line with the number of hashed bins. However, we do not believe that this will address the issues we're encountering (detailed below).

The maintainers of Packagist.org are interested in providing TUF coverage for their ~400,000 packages and over 4.5M releases (~6M targets). At their scale, they're seeing peaks of upward of 20 new releases per second. Also, the delays imposed by consistent snapshots are a non-starter for them.
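To give a rough sense of why `snapshot.json` scales with the bin count, here is a minimal sketch; the bin file naming and the omission of per-entry hashes and lengths are simplifying assumptions, not Rugged's actual output:

```python
# Illustrative only: estimate how a snapshot.json that lists one entry per
# hashed bin grows with the number of bins.
import json

def snapshot_size(num_bins: int) -> int:
    meta = {f"bin_{i:04x}.json": {"version": 1} for i in range(num_bins)}
    snapshot = {"signed": {"_type": "snapshot", "version": 1, "meta": meta}}
    return len(json.dumps(snapshot))

for bins in (256, 2048, 16384):
    print(f"{bins:>6} bins -> ~{snapshot_size(bins) // 1024} KiB of snapshot metadata")
```

Adding per-entry hashes and lengths (both optional in snapshot metadata) only makes the growth steeper, and a fresh copy of the whole file has to be fetched whenever any bin changes.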
Dependency resolution overhead
For our use case with Composer (client-side), each target is named like `drupal/core-datetime/8.9.19.0`, to keep releases distinct. We also sign the Composer metadata (`composer.json`) that accompanies each package (`p2/drupal/core-datetime.json`?).

When Composer is resolving dependencies, it must download many of these `composer.json` files. However, due to how hashed bins distribute these files, they end up spread across multiple bins. As a result, it's likely that a project that uses even a relatively small number of packages will need to maintain a significant number of `bin_n` targets metadata files, most of the contents of which will be irrelevant to the project.

Even if we were to share locally-cached TUF metadata across multiple projects, it would still result in an almost complete copy of the entire TUF repository metadata.
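To make the scattering concrete, here is a small sketch of prefix-based bin selection (the standard hashed-bins technique; the exact prefix length, bin naming, and target paths below are illustrative assumptions rather than Rugged's implementation):

```python
# A minimal sketch of why hashed-bin delegation scatters a project's targets:
# bin selection hashes the target path, so the packages a single project
# depends on land in unrelated bins.
import hashlib

NUM_BINS = 2048          # power of two, matching the figure above
PREFIX_HEX_DIGITS = 3    # 16**3 = 4096 prefixes, i.e. 2 prefixes per bin

def bin_for_target(target_path: str) -> int:
    digest = hashlib.sha256(target_path.encode()).hexdigest()
    prefix = int(digest[:PREFIX_HEX_DIGITS], 16)
    return prefix * NUM_BINS // (16 ** PREFIX_HEX_DIGITS)

targets = [
    "p2/drupal/core-datetime.json",
    "p2/drupal/core-field.json",
    "p2/symfony/console.json",
    "p2/guzzlehttp/guzzle.json",
]
bins = {t: bin_for_target(t) for t in targets}
print(bins)                     # each target typically lands in a different bin
print(len(set(bins.values())))  # even 4 dependencies usually touch 4 bin_n files
```

In practice this means that even a modest dependency tree touches a large, essentially random subset of the `bin_n` metadata files.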
Proposed solution
Instead of scaling up a single TUF repository, we're proposing to scale out to many smaller repositories (possibly one per package), using the same root metadata and signing keys.
Each TUF repository (of which there would be over 400k) would be very simple. Hashed bins would not be required, since each would only contain an average of 10-15 targets. There should never be enough targets to warrant hashed bins, since each repo only contains the releases of a single package. Even if it were required, we could implement hashed bins selectively on a per-package-repo basis.
From the client side, there would be the overhead of downloading `timestamp.json` and `snapshot.json` for each package in use, but both of these files would be very small. `targets.json` would scale with the number of releases. However, the client would never have to interact with any TUF metadata for packages not in use within their project (see the sketch below).

This seems somewhat similar to the architecture of Notary, where each "collection" appears to be something like a stand-alone TUF repository.
This also appears to make parallelizing server-side operations much simpler, since it removes the issue of parallel processes trying to write and sign the same metadata. However, this may be specific to Rugged's architecture.
Root Metadata
We initially thought that we might be able to keep a single `n.root.json` file for all of these repos, but that would present problems when rotating online keys.

When rotating online keys, any metadata signed by the role whose key was rotated will need to be re-signed, which would take a non-trivial amount of time. As a result, we would want to be able to progressively roll out new root metadata (along with the re-signed metadata).

So we expect to need to keep each repo's root metadata separately, even though they'll all be the same (most of the time).
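As a rough illustration of that progressive rollout, the sketch below rotates an online key across sub-repos in batches; the two helpers are hypothetical stand-ins for whatever the server-side tooling would actually do:

```python
# Because each sub-repo keeps its own root chain, an online-key rotation can
# be applied (and the affected metadata re-signed) in batches rather than in
# one repository-wide pass.
from itertools import islice

def rotate_online_key(repo: str, new_key_id: str) -> None:
    # hypothetical: would write this repo's next N.root.json referencing new_key_id
    print(f"{repo}: new root version signed, online key -> {new_key_id}")

def resign_metadata(repo: str) -> None:
    # hypothetical: would re-sign this repo's timestamp/snapshot/targets with the new key
    print(f"{repo}: online metadata re-signed")

def batched(items, size):
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch

def progressive_rotation(sub_repos, new_key_id, batch_size=1000):
    for batch in batched(sub_repos, batch_size):
        for repo in batch:
            rotate_online_key(repo, new_key_id)
            resign_metadata(repo)
        # repos later in the queue keep serving metadata signed by the old key
        # until their batch is processed, so the rollout can pause and resume

progressive_rotation([f"package-{i}" for i in range(5)], "kid-new", batch_size=2)
```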
Mirrors.json
We've looked at `mirrors.json` as a potential way of implementing something similar to the above, insofar as it would allow us to effectively split a single repository into namespaces. But `snapshot.json` is still shared, and so this doesn't appear to be a fruitful path.

Providing trusted root metadata
From §2.1.1 ("Root role", https://theupdateframework.github.io/specification/latest/#root):
Likewise, from §5.2 ("Load trusted root metadata"):
We cannot reasonably ship hundreds of thousands of root metadata files with the client-side implementation. With a per-package layout, these would also need to be updated frequently, as new packages are added.
To provide trusted root metadata for all of these TUF repos, we envision a "meta-repository" that provides TUF signatures for these root metadata files. The client thus ships with a single top-level root metadata file, while being able to download and verify the initial root metadata for each of the "sub-repositories".
For a software repository the size of Packagist, this repository of keys could itself implement hashed bins for performance reasons, as it would contain hundreds of thousands of targets (the initial root metadata files).
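Below is a minimal sketch of the client-side hand-off this implies, assuming the meta-repository lists each sub-repo's initial root (e.g. `drupal/core-datetime/1.root.json`) as an ordinary TUF target with a sha256 hash. The URL scheme and fetch helper are hypothetical, and verification of the meta-repository's own metadata chain is elided:

```python
import hashlib
import urllib.request

SUB_REPO_BASE = "https://repo.example.org/packages"  # placeholder host

def fetch(url: str) -> bytes:
    # hypothetical transport; a real client would enforce length limits, timeouts, etc.
    with urllib.request.urlopen(url) as response:
        return response.read()

def bootstrap_sub_repo_root(package: str, verified_targets: dict) -> bytes:
    """Return trusted initial root metadata for one per-package sub-repo.

    `verified_targets` is the target-path -> fileinfo mapping taken from the
    meta-repository's already-verified targets metadata.
    """
    target_path = f"{package}/1.root.json"
    expected_sha256 = verified_targets[target_path]["hashes"]["sha256"]

    root_bytes = fetch(f"{SUB_REPO_BASE}/{package}/1.root.json")
    if hashlib.sha256(root_bytes).hexdigest() != expected_sha256:
        raise ValueError(f"initial root for {package} does not match meta-repo hash")

    # From here, a normal TUF client update (root -> timestamp -> snapshot ->
    # targets) runs against the sub-repo, with root_bytes as its trust anchor.
    return root_bytes
```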
No global snapshot
Each sub-repo is a full TUF repo, providing timestamp and snapshot metadata. However, in this scenario, we do not have a single view of all packages. The stated purpose of the top-level `snapshot.json` is mitigating mix-and-match (and similar) attacks. However, the very existence of this file is at the crux of the scalability challenges that we're observing (and hypothesizing).

We believe this layout remains reasonably resistant to such attack vectors. The top-level repo contains snapshot metadata covering each initial root metadata file, while each sub-repo contains its own snapshot metadata. If this is deemed insufficient, we could maintain versioned root metadata (rather than just the initial root metadata) as the targets of the top-level repo.
Our questions