UDP proxy for upstream health monitoring #37824

Open
taikai-zz opened this issue Dec 27, 2024 · 3 comments
Labels
area/health_checking area/udp area/upstream question Questions that are neither investigations, bugs, nor enhancements

Comments

taikai-zz commented Dec 27, 2024

admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: UDP
        address: 0.0.0.0
        port_value: 1234
    udp_listener_config:
      downstream_socket_config:
        max_rx_datagram_size: 9000
    listener_filters:
    - name: envoy.filters.udp_listener.udp_proxy
      typed_config:
        '@type': type.googleapis.com/envoy.extensions.filters.udp.udp_proxy.v3.UdpProxyConfig
        stat_prefix: service
        matcher:
          on_no_match:
            action:
              name: route
              typed_config:
                '@type': type.googleapis.com/envoy.extensions.filters.udp.udp_proxy.v3.Route
                cluster: service_udp
        upstream_socket_config:
          max_rx_datagram_size: 9000
  clusters:
  - name: service_udp
    type: STATIC
    lb_policy: ROUND_ROBIN
    connect_timeout: 0.25s
    load_assignment:
      cluster_name: service_udp
      endpoints:
      - lb_endpoints:
        - endpoint:
            health_check_config:
              port_value: 64879
              address:
                socket_address:
                  address: 1.1.1.1
                  port_value: 64879
            address:
              socket_address:
                address: 1.1.1.1
                port_value: 64879
        - endpoint:
            health_check_config:
              port_value: 64879
              address:
                socket_address:
                  address: 2.2.2.2
                  port_value: 64879
            address:
              socket_address:
                address: 2.2.2.2
                port_value: 64879

64879 is the WireGuard service port, and WireGuard is running on both 1.1.1.1 and 2.2.2.2. After I stop the WireGuard service on 1.1.1.1, curl http://127.0.0.1:9901/clusters still reports that endpoint as healthy. How can I solve this? How do I run a health check against UDP port 64879?

taikai-zz added the triage label Dec 27, 2024
phlax added the area/udp, area/health_checking, area/upstream, and question labels and removed the triage label Dec 29, 2024
phlax (Member) commented Dec 29, 2024

cc @zuercher @botengyao

zuercher (Member) commented

You’ve specified the address and port to use when health checking is enabled, but haven’t enabled health checking.

Incidentally, you can leave off the health check config for the endpoints since it’s identical to the endpoint’s IP and port.
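As a sketch of that simplification, the load_assignment from the original config could be reduced to the fragment below. It only drops the redundant health_check_config blocks; the addresses and ports are the ones from the issue, and nothing else is assumed to change.

```yaml
# Endpoints without health_check_config: when health checking is enabled,
# Envoy probes the endpoint's own address and port by default.
load_assignment:
  cluster_name: service_udp
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address:
            address: 1.1.1.1
            port_value: 64879
    - endpoint:
        address:
          socket_address:
            address: 2.2.2.2
            port_value: 64879
```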

Health checks are configured using the health_checks field of the cluster (https://www.envoyproxy.io/docs/envoy/v1.32.3/api-v3/config/cluster/v3/cluster.proto#envoy-v3-api-msg-config-cluster-v3-cluster).

There are tcp, http and grpc health checkers by default, plus some additional custom health checks for redis and thrift. There’s no UDP equivalent of the tcp check because UDP is connection-less. I think the closest possibility would be an http health check with the cluster configured for http3, but I’ve never tried this.
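For illustration only, here is a minimal sketch of what enabling health checking on this cluster might look like. Because there is no UDP checker, it uses a connect-only TCP check against a side port. Port 8080 is hypothetical: it assumes each WireGuard host runs some TCP listener that is only up while WireGuard is up, which is not part of the original setup. The timing and threshold values are placeholders, not recommendations.

```yaml
clusters:
- name: service_udp
  type: STATIC
  lb_policy: ROUND_ROBIN
  connect_timeout: 0.25s
  health_checks:
  - timeout: 1s              # placeholder value
    interval: 5s             # placeholder value
    unhealthy_threshold: 2   # consecutive failures before marking unhealthy
    healthy_threshold: 2     # consecutive successes before marking healthy
    tcp_health_check: {}     # empty payload means a connect-only TCP check
  load_assignment:
    cluster_name: service_udp
    endpoints:
    - lb_endpoints:
      - endpoint:
          health_check_config:
            port_value: 8080   # hypothetical TCP side port, an assumption
          address:
            socket_address:
              address: 1.1.1.1
              port_value: 64879
      - endpoint:
          health_check_config:
            port_value: 8080   # hypothetical TCP side port, an assumption
          address:
            socket_address:
              address: 2.2.2.2
              port_value: 64879
```

Here the per-endpoint health_check_config is not redundant, because the probe port differs from the UDP service port. A TCP check aimed directly at 64879 would be expected to fail regardless of WireGuard's state, since WireGuard listens on UDP only.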

taikai-zz (Author) commented

@zuercher The current problem is that after I stop the service on one of the IPs, http://127.0.0.1:9901/clusters still reports the stopped endpoint as healthy, so clients keep being routed to the broken machine. Is there a good solution?
