

sitole
Member

@sitole sitole commented Sep 25, 2025

Docs related to Traefik and cooperation with Nomad (https://doc.traefik.io/traefik/reference/install-configuration/providers/hashicorp/nomad/).

Why?

  • Currently, we are mapping port numbers and port names back and forth to propagate them from the Nomad job to the VM instances and the backend used by the load balancer.
  • Ingress simplifies the load balancer setup: the load balancer exposes just one port for ingress, which makes configuration easier across different cloud providers.
  • To expose a job to the load balancer, we currently need to run it on the host network. At the same time, a safe blue/green deployment then requires multiple nodes. With ingress, we can run multiple APIs, client proxies, etc. on the same host during deployment.
  • When deploying internal services, the process of propagating additional VM ports, firewall, and load balancer configuration can easily cause misconfiguration errors, and it's not developer-friendly. With ingress, we can deploy the service to the cluster and define Traefik labels to route the new service as needed.
  • For future migrations, we can implement custom, temporary routing in our own Traefik/Nginx as needed, instead of relying on a load balancer that has no access to business logic.
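
To make the "define Traefik labels" point concrete, here is a minimal sketch of the tags a Nomad service might carry. The helper `traefik_tags`, the router name `internal-api`, the hostname, and the priority value are all hypothetical; only the `traefik.enable` and `traefik.http.routers.<name>.*` key layout follows Traefik's documented label scheme, and it assumes the provider's default `traefik` prefix.

```python
def traefik_tags(router: str, host: str, priority: int) -> list[str]:
    """Build Traefik routing tags for a Nomad service (hypothetical helper)."""
    return [
        # Opt the service in to Traefik discovery.
        "traefik.enable=true",
        # Route requests for this hostname to the service.
        f"traefik.http.routers.{router}.rule=Host(`{host}`)",
        # Explicit priority so that a catch-all router cannot shadow this one.
        f"traefik.http.routers.{router}.priority={priority}",
    ]


# Example values are assumptions, not taken from the actual setup.
tags = traefik_tags("internal-api", "internal-api.example.com", 100)
for tag in tags:
    print(tag)
```

With tags like these on the service, no extra VM port, firewall rule, or load balancer backend would need to be added for a new internal service.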

Points that need some care:

  • Nomad and Consul still need to be attached to the LB manually; otherwise, we would create a circular dependency.
  • We have some load balancer filtering and security rules that can be transferred too. We need to investigate the catch-all hostname further: currently we cannot distinguish between (api|docker|edge). and *. because both are routed to the same backend. We can work around that.
  • We need to define how routing priorities are assigned. You cannot have multiple routers with the same priority. Sandbox traffic should get the lowest priority, as it serves as a catch-all. Path-based routes (such as the one we use for the events API) should have a higher priority than the API itself.
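
The priority rules above can be sketched as a small model. This is a simplified illustration, not Traefik's implementation: it only assumes that, among the matching routers, the one with the highest priority wins. The router names, hostnames, the `/events` path, and the numeric priorities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Router:
    name: str
    priority: int
    matches: Callable[[str, str], bool]  # (host, path) -> bool


def check_unique_priorities(routers: list[Router]) -> None:
    """Reject configurations where two routers share a priority."""
    seen: dict[int, str] = {}
    for r in routers:
        if r.priority in seen:
            raise ValueError(
                f"{r.name} and {seen[r.priority]} share priority {r.priority}"
            )
        seen[r.priority] = r.name


def pick_router(routers: list[Router], host: str, path: str) -> Optional[Router]:
    """Return the matching router with the highest priority, if any."""
    candidates = [r for r in routers if r.matches(host, path)]
    return max(candidates, key=lambda r: r.priority, default=None)


ROUTERS = [
    # Catch-all for sandbox traffic: lowest priority.
    Router("sandbox-catch-all", 1, lambda host, path: True),
    # The API itself.
    Router("api", 100, lambda host, path: host.startswith("api.")),
    # Path-based route on the API host: must beat the API router.
    Router(
        "events",
        200,
        lambda host, path: host.startswith("api.") and path.startswith("/events"),
    ),
]

check_unique_priorities(ROUTERS)
```

In this model a request for the `/events` path on the API host resolves to the `events` router, any other API request to `api`, and everything else falls through to the sandbox catch-all.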

@sitole sitole added the improvement Improvement for current functionality label Sep 25, 2025
@sitole
Member Author

sitole commented Oct 9, 2025

Closing, as we already have an open PR for the partial migration to ingress (#1314).

@sitole sitole closed this Oct 9, 2025
@sitole sitole deleted the poc/ingress-controller branch October 9, 2025 09:15