[Enhancement][opensearch] Semantic health probes #474
Comments
There was an extensive discussion about this issue on this PR: #172. Please take a look and see whether the discussion covers this issue.
Thanks @smlx, I read through the discussion :) I don't think using client certificates was evaluated in the discussion. I would suggest having a client certificate whose CN/DN is only allowed to call GET _cluster/health. This way we should be able to achieve a secure, passwordless health check. What do you think?
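For illustration, restricting a client certificate to the health endpoint could look roughly like this with the OpenSearch security plugin (the role name and DN here are hypothetical, and the exact mapping depends on how clientcert authentication is configured in the cluster):

```yaml
# roles.yml — hypothetical role that may only read cluster health
health_probe:
  cluster_permissions:
    - "cluster:monitor/health"
```

```yaml
# roles_mapping.yml — map the probe certificate's subject to that role
health_probe:
  users:
    - "CN=health-probe,OU=ops,O=example"
```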
One problem with using the
@smlx Good point, but we can fix this by only using a startupProbe, which only runs on the first start of the pod 🤔 I see why it's a tricky topic; it's just that we sometimes get a red cluster state because of the rolling restart behaviour (e.g. 3 nodes, node 1 restarted and not fully initialized yet, then node 2 also reboots -> cluster red).
But in that case imagine that enough pods go offline for the cluster to go red (e.g. nodes crash). How will the replacement pods rejoin the cluster, since they will wait on the startup probe for the cluster to go green, which it never will?
Mhm, good point. I thought about using this startup probe in our project:

```shell
CERTS="/usr/share/opensearch/config/certs/admin"
curl --fail-with-body --cert "$CERTS/tls.crt" --key "$CERTS/tls.key" -k \
  "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=1s"
```
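Wired into the StatefulSet, that snippet could serve as a startup probe along these lines (a sketch: the certificate paths come from the snippet above, while the thresholds are placeholder values to be tuned):

```yaml
startupProbe:
  exec:
    command:
      - /bin/bash
      - -c
      - |
        CERTS="/usr/share/opensearch/config/certs/admin"
        curl --fail-with-body --cert "$CERTS/tls.crt" --key "$CERTS/tls.key" -k \
          "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=1s"
  periodSeconds: 10
  failureThreshold: 30   # allow up to ~5 minutes for the cluster to go green
```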
How do you restart your cluster (also @smlx)? In our case the cluster will at some point go red, because Kubernetes doesn't wait for the cluster to be green again after restarting a node; it just restarts the next StatefulSet pod.
Another idea: we could also add a preStop lifecycle hook that either drains the OpenSearch node or makes sure that the cluster health is green before restarting it. What do you think of that? Basically I suggest adding sane defaults for this preStop hook (or at least an example of how to do it) here: https://github.com/opensearch-project/helm-charts/blob/main/charts/opensearch/values.yaml#L409 Edit: an idea would be to just provide it as an example of how somebody might implement this.
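As a hedged sketch of the preStop idea (the timeout and certificate paths are assumptions, and terminationGracePeriodSeconds has to be at least as long as the wait, since Kubernetes kills the pod once the grace period expires regardless of the hook):

```yaml
lifecycle:
  preStop:
    exec:
      command:
        - /bin/bash
        - -c
        - |
          # Best effort: block termination until the cluster reports green again.
          CERTS="/usr/share/opensearch/config/certs/admin"
          curl --cert "$CERTS/tls.crt" --key "$CERTS/tls.key" -k \
            "https://localhost:9200/_cluster/health?wait_for_status=green&timeout=300s"
```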
Yeah, that might work, I like the idea. There also might be something we could do as a pre-upgrade hook in the Helm chart, maybe in combination with a pod postStart or preStop hook. Honestly speaking, in the interest of a speedy upgrade we just incur a short window of the cluster going red.
Is your feature request related to a problem? Please describe.
It's frustrating that, although the OpenSearch API distinguishes between different cluster states (red/yellow/green), the liveness/readiness probes do not take them into account. This makes the probes semantically useless during a rolling restart: a second OpenSearch node is restarted before all shards have been moved.
Describe the solution you'd like
I want an endpoint that only responds OK when the cluster health is green (or maybe yellow). One idea would be to add a small sidecar container as an adapter that interprets the response from GET _cluster/health, which looks like this:
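For reference, GET _cluster/health returns a small JSON document whose `status` field carries the red/yellow/green state; the sidecar only has to translate that into an HTTP code for the kubelet. A minimal sketch of that mapping (shown in Python for brevity, though the service itself could equally be the Go binary suggested here; the sample field values are illustrative):

```python
import json

# Abridged sample shaped like a _cluster/health response (values illustrative).
SAMPLE_HEALTH = json.dumps({
    "cluster_name": "opensearch-cluster",
    "status": "yellow",
    "timed_out": False,
    "number_of_nodes": 3,
    "active_shards": 12,
    "relocating_shards": 0,
    "initializing_shards": 2,
    "unassigned_shards": 4,
})

# Statuses the probe treats as healthy; add "yellow" if that should pass too.
OK_STATUSES = {"green"}

def probe_status_code(health_body: str) -> int:
    """Map a cluster health response body to an HTTP code for the probe."""
    status = json.loads(health_body).get("status")
    return 200 if status in OK_STATUSES else 503

print(probe_status_code(SAMPLE_HEALTH))            # yellow -> 503
print(probe_status_code('{"status": "green"}'))    # -> 200
```

With OK_STATUSES configurable, the same adapter covers both the strict (green-only) and the relaxed (green-or-yellow) policy discussed above.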
I would suggest implementing a small Go service, since it can be written in very few lines of code and is very lightweight. Alternatively, we could use a curl command along the lines of
```
GET _cluster/health?wait_for_status=green&timeout=1s
```
since it times out when the cluster state is not green and hence fails the readiness probe check. In both cases we somehow need to add a user that is allowed to call the endpoint (or use a dedicated client certificate, which probably complicates the setup).

Describe alternatives you've considered
Using the Kubernetes Operator for OpenSearch is an option.
Additional context
I contributed #329 and #333 and can also assist in the implementation of this if others also think this is a good idea.