From bb4132fbfa272d21fee607162f173f340d099d38 Mon Sep 17 00:00:00 2001
From: Amelia Downs
Date: Fri, 1 Dec 2023 16:30:12 -0500
Subject: [PATCH 1/4] Update audit event names in readiness healthcheck RFC

Rename app process audit events to better match the liveness healthcheck
---
 toc/rfc/rfc-0020-readiness-healthchecks.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/toc/rfc/rfc-0020-readiness-healthchecks.md b/toc/rfc/rfc-0020-readiness-healthchecks.md
index 375a737d..67e9b1bf 100644
--- a/toc/rfc/rfc-0020-readiness-healthchecks.md
+++ b/toc/rfc/rfc-0020-readiness-healthchecks.md
@@ -157,11 +157,14 @@ to route pool". When AI readiness healthcheck fails a log line is printed to AI
 logs: "Container failed the readiness health check. Container marked not ready
 and removed from route pool".
 
-#### App events
+#### App Audit events
 
-When AI readiness healthcheck succeeds a new application event is emitted:
-"app.ready". When AI readiness healthcheck fails a new event is emitted:
-"app.notready".
+When the liveness healthchecks fail, it results in the following audit events:
+`audit.app.process.crash` and `audit.app.process.rescheduling`.
+
+Similarly, when AI readiness healthcheck succeeds a new application event should be emitted:
+`audit.app.process.ready`. And when AI readiness healthcheck fails a new event should be emitted:
+`audit.app.process.notready`.
 
 ### Open Questions
 * What metrics would be helpful for app devs and operators?

From 7c6441b888cb63fb2cd13581f1345833f842b93c Mon Sep 17 00:00:00 2001
From: Amelia Downs
Date: Tue, 5 Dec 2023 07:07:18 -0500
Subject: [PATCH 2/4] Update toc/rfc/rfc-0020-readiness-healthchecks.md

---
 toc/rfc/rfc-0020-readiness-healthchecks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/toc/rfc/rfc-0020-readiness-healthchecks.md b/toc/rfc/rfc-0020-readiness-healthchecks.md
index 67e9b1bf..2a56706f 100644
--- a/toc/rfc/rfc-0020-readiness-healthchecks.md
+++ b/toc/rfc/rfc-0020-readiness-healthchecks.md
@@ -157,7 +157,7 @@ to route pool". When AI readiness healthcheck fails a log line is printed to AI
 logs: "Container failed the readiness health check. Container marked not ready
 and removed from route pool".
 
-#### App Audit events
+#### App Audit Events
 
 When the liveness healthchecks fail, it results in the following audit events:
 `audit.app.process.crash` and `audit.app.process.rescheduling`.

From 319e9ddcec62b9ff1cb63d552fa5e85fd24c8a58 Mon Sep 17 00:00:00 2001
From: Amelia Downs
Date: Tue, 5 Dec 2023 07:10:08 -0500
Subject: [PATCH 3/4] Update toc/rfc/rfc-0020-readiness-healthchecks.md

---
 toc/rfc/rfc-0020-readiness-healthchecks.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/toc/rfc/rfc-0020-readiness-healthchecks.md b/toc/rfc/rfc-0020-readiness-healthchecks.md
index 2a56706f..ae2f5622 100644
--- a/toc/rfc/rfc-0020-readiness-healthchecks.md
+++ b/toc/rfc/rfc-0020-readiness-healthchecks.md
@@ -162,9 +162,8 @@ and removed from route pool".
 When the liveness healthchecks fail, it results in the following audit events:
 `audit.app.process.crash` and `audit.app.process.rescheduling`.
 
-Similarly, when AI readiness healthcheck succeeds a new application event should be emitted:
-`audit.app.process.ready`. And when AI readiness healthcheck fails a new event should be emitted:
-`audit.app.process.notready`.
+Similarly, when an AI readiness healthcheck succeeds an `audit.app.process.ready` event should be emitted.
+And when an AI readiness healthcheck fails an `audit.app.process.notready` event should be emitted.
 
 ### Open Questions
 * What metrics would be helpful for app devs and operators?

From 2661b939608c313766b69c4d153fd4b7d1b3e590 Mon Sep 17 00:00:00 2001
From: Amelia Downs
Date: Mon, 11 Dec 2023 16:24:02 -0500
Subject: [PATCH 4/4] Update toc/rfc/rfc-0020-readiness-healthchecks.md

---
 toc/rfc/rfc-0020-readiness-healthchecks.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/toc/rfc/rfc-0020-readiness-healthchecks.md b/toc/rfc/rfc-0020-readiness-healthchecks.md
index ae2f5622..9d89dd4e 100644
--- a/toc/rfc/rfc-0020-readiness-healthchecks.md
+++ b/toc/rfc/rfc-0020-readiness-healthchecks.md
@@ -159,8 +159,7 @@ and removed from route pool".
 
 #### App Audit Events
 
-When the liveness healthchecks fail, it results in the following audit events:
-`audit.app.process.crash` and `audit.app.process.rescheduling`.
+When a liveness healthcheck fails, it results in the `audit.app.process.crash` audit event.
 
 Similarly, when an AI readiness healthcheck succeeds an `audit.app.process.ready` event should be emitted.
 And when an AI readiness healthcheck fails an `audit.app.process.notready` event should be emitted.