Fast pull the full images in parallel without lazy loading #195

Open
1 task
shuaichang opened this issue May 23, 2023 · 1 comment
Labels
enhancement New feature or request

Comments


shuaichang commented May 23, 2023

What is the version of your Accelerated Container Image

No response

What would you like to be added?

OverlayBD is great at accelerating container image pulls. We've enjoyed the benefits and appreciate all the support from the community!

Why is this needed for Accelerated Container Image?

Problems

On-demand data transfer and trace-based prefetch are great tools. However, we see a gap between them that could be filled: fast prefetch of all blobs.

The following are the reasons:

  1. For some applications, lazy loading changes application behavior. One example is a K8s workload with startup/liveness/readiness probes: before lazy pulling, it starts up with no issue; after onboarding to lazy pulling, it fails to start because the existing probe period is no longer long enough. This makes it hard for some applications to adopt OverlayBD without changing their config.
  2. It's not easy to observe the overall latency of the image pull, because that time is now attributed to application startup time. Lazy pulling also introduces a new failure mode: previously, we would not run an application unless the image pull had succeeded. With lazy pulling, a failure can surface as a runtime I/O hang or other errors that are hard to debug (especially when different teams own the application and the container/image runtime infra).
  3. Downloading full blobs can also be fast; only decompression is slow, and decompression of OverlayBD images is very fast. With high concurrency, we were able to saturate the VM bandwidth and download a multi-GB OverlayBD image in several seconds.

I am aware that trace-based prefetch would make this issue much better, but adding trace recording to the CI/CD build system can be costly in a large-scale infra with many dependencies.

Therefore, I feel that if OverlayBD had a feature that sits between lazy loading and trace-based prefetch (let's just call it Prefetch), it would be a perfect solution that does not require too steep a learning curve or too much courage to adopt (Problem 2 is a pretty big mindset shift that can slow down adoption).

Options

We propose some options here; please feel free to add more.

  • Option 1 (what we are trying now): an external_image_puller pulls blobs from the registry in parallel, which can be quite fast when the VM network is saturated. Afterwards, we put the blobs into the registry_cache directory.
    • Pros:
      • Relatively easy to implement; no OverlayBD-side changes required
      • Flexible for users (us) to tune performance
    • Cons:
      • Is it thread-safe, given that both overlaybd-tcmu and the external_image_puller might write to registry_cache? Will overlaybd-tcmu be able to detect new blob caches added by the external_image_puller?
      • Not an OverlayBD feature, so it cannot be reused by the community
      • Is there a good way to validate the integrity of the image?
  • Option 2: OverlayBD supports prefetch with parallelism. (OverlayBD already supports rpull --download-blobs for prefetching the full image. However, it is pretty slow because 1) it performs an unnecessary apply, which is part of the containerd image pull library code, and 2) the blobs are pulled sequentially.)
    • To make it fast, rpull could also download blobs in chunks in parallel and return success only once the image is fully downloaded.
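As a rough illustration of the "download blobs in chunks in parallel" idea, and of one possible answer to the integrity question under Option 1, here is a minimal Python sketch. It is not OverlayBD or containerd code. The names `split_ranges`, `download_blob`, and the `fetch` callback are hypothetical; `fetch(start, end)` is assumed to return the bytes of one blob in the half-open range [start, end). The integrity check relies only on the fact that registry blobs are addressed by their sha256 digest.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# Hypothetical callback type: returns bytes [start, end) of a single blob.
RangeFetcher = Callable[[int, int], bytes]


def split_ranges(size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split [0, size) into half-open ranges of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(s, min(s + chunk_size, size)) for s in range(0, size, chunk_size)]


def download_blob(
    fetch: RangeFetcher,
    size: int,
    expected_digest: str,
    chunk_size: int = 4 << 20,
    workers: int = 8,
) -> bytes:
    """Fetch all chunks concurrently, then succeed only if the digest matches."""
    ranges = split_ranges(size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so chunks stay in sequence.
        chunks = list(pool.map(lambda r: fetch(r[0], r[1]), ranges))
    for (start, end), chunk in zip(ranges, chunks):
        if len(chunk) != end - start:
            raise IOError(f"short read for range {start}-{end}")
    # Simplification: the blob is assembled in memory. A real puller would
    # more likely write each chunk at its offset in a file.
    data = b"".join(chunks)
    actual = "sha256:" + hashlib.sha256(data).hexdigest()
    if actual != expected_digest:
        raise ValueError(f"digest mismatch: got {actual}")
    return data
```

Against a real registry, `fetch` would presumably issue a GET with a `Range` header for the blob, assuming the registry honors range requests. The chunk size and worker count are the knobs a user would tune to saturate the VM bandwidth.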

Please feel free to contribute ideas as well. Again, we appreciate all the great work from the OverlayBD community. By contributing real-world use cases and requirements, we hope we can also help drive OverlayBD adoption.

Thanks!

Are you willing to submit PRs to contribute to this feature?

  • Yes, I am willing to implement it.
shuaichang added the enhancement label May 23, 2023
Member

liulanzheng commented May 25, 2023

@shuaichang
I think, in general, there are two implementation paths:

  1. In containerd: use rpull --download-blobs to pull overlaybd images, with two improvements needed. One is to download a single blob in parallel chunks to speed up the download. The other is to remove the untar/decompression step from the content store to the snapshot.
  2. In overlaybd: use the download cache type, with full-speed background download with no delay, or full-speed prefetch. I also think external_image_puller is feasible for the download cache type: external_image_puller downloads the blobs and writes them into the corresponding snapshot directories.
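For the second path, one common way for an external puller to write into a directory that another process also reads is to publish each file atomically: write to a temporary file in the same directory, then rename it into place. The Python sketch below shows only that generic pattern. The function name `publish_atomically` and the destination path are hypothetical, and the thread does not confirm what file layout overlaybd expects or whether overlaybd-tcmu picks up files added this way.

```python
import os
import tempfile


def publish_atomically(data: bytes, dest_path: str) -> None:
    """Write data so that readers see either the old file or the complete new one."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    os.makedirs(dest_dir, exist_ok=True)
    # The temp file lives in the destination directory so the final rename
    # stays on one filesystem, which is what makes it atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".partial-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before publishing
        os.replace(tmp_path, dest_path)  # atomic rename over any existing file
    except BaseException:
        # Clean up the partial file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a concurrent reader never observes a half-written blob, which addresses part of the thread-safety concern raised for Option 1. It does not by itself coordinate two writers producing the same file.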
