Experience running this plugin with gRPCRoutes in a linkerd-meshed cluster #75
Comments
One thing we did run into while setting this up was a CRD incompatibility between linkerd-crds and this plugin. linkerd-crds installs httpRoute
So we've deployed this all to our cluster, but I'm having a hard time verifying that it works, as we only use the HTTP/gRPC routes for traffic routing during canary rollouts. Our linkerd-proxy metrics show no data for routes. I'm going to try to create a minimal repro locally with kind to see if it actually uses the HTTPRoutes/GRPCRoutes.
Well I just got to try HTTPRoute using the The way it works:
With this, it seems to work flawlessly! I'm going to test GRPCRoutes next and ensure they work as well.
Thank you @FredrikAugust for the feedback!🙏
No worries. The status now is that I've confirmed it works fine with Traefik -> Linkerd + this plugin with Argo Rollouts. What's missing is testing that gRPC works as it should, which is a little trickier, as Traefik doesn't currently support GRPCRoutes. I'll try to test this tomorrow by running a simple application which connects to
Okay, so I got around to creating the helper tool: https://github.com/kvist-no/grpc-lb-tester. It does two things, every
And the verdict is that it all seems to work. I first set up two backends for stable, and ensured that they each got ~50% traffic (this is controlled by LB algo of l5d). Then I triggered a rollout upgrade and set the steps to
When it paused at 50%, I confirmed that the single canary pod again got ~50% of the traffic and the two stable pods got the other 50%. Then I promoted the rollout and the weight flipped to 100% stable and 0% canary, and the traffic was routed accordingly. I also tried
Secondly, I tested the
For testing HTTP, I simply ran
I don't think there is anything left to test from the subset of functionality that we will use, but I can loop back if we encounter any problems in production. Thank you for the great plugin, it works very well! I hope we can see a release of the gRPC functionality soon 🙌
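For anyone who wants to repeat the "~50% each" check on their own request logs, here is a minimal sketch of the arithmetic. It is not how grpc-lb-tester works internally; the function names, the 5% tolerance, and the sample data are all made up for illustration. It assumes you have already collected which backend answered each request.

```python
from collections import Counter


def traffic_share(responders):
    """Return the fraction of requests served by each backend."""
    counts = Counter(responders)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {backend: n / total for backend, n in counts.items()}


def matches_weights(share, expected, tolerance=0.05):
    """Check that every observed share is within `tolerance` of its expected weight."""
    backends = set(share) | set(expected)
    return all(
        abs(share.get(b, 0.0) - expected.get(b, 0.0)) <= tolerance
        for b in backends
    )


# Hypothetical sample: 1000 requests observed while paused at the 50% step.
observed = ["canary"] * 480 + ["stable"] * 520
share = traffic_share(observed)
print(share)
print(matches_weights(share, {"canary": 0.5, "stable": 0.5}))
```

The tolerance is needed because load balancing is not perfectly even over a finite sample, so an exact 50/50 comparison would fail spuriously.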
I'll update this once I get a response from linkerd regarding support for GRPCRoute v1.
So I've gotten a response from linkerd: they want to support v1, but don't have a timeline for it yet. linkerd/linkerd2#13032
@FredrikAugust 0.4.0 was just released and it includes gRPC support: https://github.com/argoproj-labs/rollouts-plugin-trafficrouter-gatewayapi/releases/tag/v0.4.0
Awesome, @kostis-codefresh! I don't think we'll be able to test it before linkerd upgrades to the stable API version though, unless there is a way to configure the version used in this plugin, which I don't think there is. Would it be a bad idea to allow controlling the API version through an environment variable?
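To make the environment variable idea concrete, here is a rough sketch of the lookup logic. It is written in Python purely for illustration (the plugin itself is written in Go), and nothing here exists in the plugin today: the variable name `GRPCROUTE_API_VERSION`, the default, and the function name are all hypothetical. The only facts assumed are that Gateway API has shipped GRPCRoute as both `v1alpha2` and `v1` under the `gateway.networking.k8s.io` group.

```python
import os

# GRPCRoute versions published by Gateway API.
SUPPORTED_VERSIONS = ("v1", "v1alpha2")
# Assumption: the plugin would keep using v1 unless told otherwise.
DEFAULT_VERSION = "v1"


def grpcroute_api_version(environ=None):
    """Resolve the GRPCRoute apiVersion from a (hypothetical) environment variable."""
    env = os.environ if environ is None else environ
    version = env.get("GRPCROUTE_API_VERSION", DEFAULT_VERSION).strip()
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(
            f"unsupported GRPCRoute version {version!r}; "
            f"expected one of {SUPPORTED_VERSIONS}"
        )
    return f"gateway.networking.k8s.io/{version}"


# With the variable unset, the default applies; with it set, the override wins.
print(grpcroute_api_version({}))
print(grpcroute_api_version({"GRPCROUTE_API_VERSION": "v1alpha2"}))
```

Rejecting unknown values up front would surface a typo at startup rather than as a failed route lookup mid-rollout.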
It appears we're among the first to test out this plugin with linkerd and grpcroutes so I thought I'd share some knowledge which might help others.
We're running a custom build (just from trunk) hosted in s3 and injecting that into argo rollouts using the helm chart:
Our grpcRoutes look something like this:
Our rollouts have the following canary strategy configuration:
We just rolled this out to our staging cluster, and the GRPCRoutes seem to update just fine in real time, like they're supposed to. I'm going to try to get some metrics from linkerd to see how it all works and post that here within a couple of days.
We're running the linkerd-enterprise-control-plane helm chart version 2.16, which introduced support for retries in GRPCRoutes. That was our motivation for migrating everything over to the new Gateway API.