Describe the problem to be solved
When running large clusters, situations can arise where many nodes pull the same image from the same node. This happens especially during rollouts of new deployments, when initially only a few nodes will have pulled the image. In small clusters this is generally not a problem, as the pressure on individual nodes is fairly limited. In large clusters, however, we can have hundreds of nodes pulling from the same node. As the underlying VM has limited network bandwidth, image pulls become slower and slower, which could cause all of them to fail. It would be preferable to let a few nodes pull the image faster so that they can also start distributing it.
Proposed solution to the problem
The easy solution would be to limit the number of in-flight requests to a node. This would, however, not account for the fact that layers differ in size. Another option would be to limit the total number of bytes that can be served, and deny any further requests. The third option would be to cap the bandwidth used when serving layers, so that new requests do not slow down in-flight requests.
Relates to #551 and #530