AKRI forgets custom discovery handler registration when restarted #558

Open
koepalex opened this issue Feb 20, 2023 · 6 comments
Labels: bug (Something isn't working), keep-alive

Comments

@koepalex

Describe the bug
An AKRI custom discovery handler uses a gRPC call to register with the AKRI core. If the AKRI core component restarts (see #557), it seems to lose the registration information, and the discovery handler is never called to discover.
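For context, registration boils down to a single gRPC request from the handler to the agent. A minimal sketch of its shape, assuming field names along the lines of Akri's RegisterDiscoveryHandlerRequest (the names and example values here are illustrative, not copied from the agent):

```rust
// Rough shape of the registration message a custom discovery handler sends
// to the agent over gRPC. Field names follow Akri's discovery protocol as I
// understand it; treat exact names and values as assumptions.
struct RegisterDiscoveryHandlerRequest {
    name: String,                // handler name, e.g. "opcua-asset"
    endpoint: String,            // where the agent can reach the handler's Discover service
    endpoint_type: EndpointType, // how to interpret `endpoint`
    shared: bool,                // whether discovered devices are shared across nodes
}

enum EndpointType {
    Uds,     // unix domain socket path
    Network, // TCP endpoint
}
```

The agent appears to keep this registration only in memory, which is why restarting it loses the handler until it registers again.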

Output of kubectl get pods,akrii,akric -o wide

 NAME                                              READY   STATUS    RESTARTS         AGE     IP            NODE                                NOMINATED NODE   READINESS GATES
pod/akri-agent-daemonset-4786l                    1/1     Running   68 (2d18h ago)   3d19h   10.244.1.31   aks-agentpool-28475719-vmss000000   <none>           <none>
pod/akri-agent-daemonset-d7kx5                    1/1     Running   93 (2d14h ago)   3d19h   10.244.0.27   aks-agentpool-28475719-vmss000001   <none>           <none>
pod/akri-controller-deployment-6cb9b9dcbb-d4c2s   1/1     Running   117 (3d3h ago)

Kubernetes Version: AKS 1.24.9

To Reproduce
Steps to reproduce the behavior:

  1. Create cluster using AKS
  2. Install Akri with the Helm command
  3. Configure Akri to use custom discovery handler
  4. Force restart of Akri agents

Expected behavior
Once AKRI is restarted, it should also restart the referenced discovery handlers and thereby let them re-register. Alternatively, AKRI could persist the registration information and reuse the discovery handlers once it is restarted.

Logs (please share snips of applicable logs)

agent.log

It seems that this line is unexpected:

[2023-02-20T11:18:39Z TRACE agent::util::discovery_operator] delete_offline_instances - entered for configuration Some("akri-opcua-asset")

The discovery handler is still up and running, waiting for a discover call.

Additional context
n/a

koepalex added the bug label on Feb 20, 2023
@mregen

mregen commented Feb 20, 2023

This PR might be related: #385

@kate-goldenring
Contributor

Hi @koepalex! It is expected that the discovery handler will re-register with the agent if its connection with the agent is dropped (such as due to the agent restarting). @mregen points to a good example of how we do this in Akri's Discovery Handlers.
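For anyone else hitting this, a minimal sketch of that re-registration loop, assuming a tokio-based handler (`register_with_agent` stands in for the tonic-generated `RegisterDiscoveryHandler` client call, and the back-off interval is arbitrary):

```rust
use std::time::Duration;

// Stand-in for the tonic-generated Registration client call. A real
// handler would connect to the agent's registration socket, register,
// and then hold the connection open -- a drop means the agent restarted.
async fn register_with_agent() -> Result<(), Box<dyn std::error::Error>> {
    // Simulated connection drop so this sketch runs standalone.
    Err("agent connection closed".into())
}

#[tokio::main]
async fn main() {
    loop {
        if let Err(e) = register_with_agent().await {
            eprintln!("registration lost: {e}; re-registering shortly");
        }
        // Back off briefly and register again instead of exiting. Without
        // this loop, an agent restart leaves the handler unregistered and
        // it is never asked to discover again.
        tokio::time::sleep(Duration::from_secs(5)).await;
    }
}
```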

@kate-goldenring
Contributor

@koepalex while the behavior of the agent is expected, I agree that it isn't ideal for instances to be deleted as a result. With shared devices like OPC UA ones, there should be a 5 minute grace period during which the discovery handler could re-register.
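To make the grace-period idea concrete, a rough sketch of tracking offline instances instead of deleting them immediately (the types and the 5-minute window are illustrative, not the agent's actual code):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Illustrative grace period; not a value taken from the agent.
const GRACE_PERIOD: Duration = Duration::from_secs(5 * 60);

// Hypothetical tracker: instances are marked offline with a timestamp and
// only deleted once the grace period elapses, giving a restarted discovery
// handler time to re-register and rediscover them.
struct InstanceTracker {
    offline_since: HashMap<String, Instant>,
}

impl InstanceTracker {
    fn new() -> Self {
        Self { offline_since: HashMap::new() }
    }

    fn mark_offline(&mut self, instance: &str) {
        self.offline_since
            .entry(instance.to_string())
            .or_insert_with(Instant::now);
    }

    fn mark_online(&mut self, instance: &str) {
        // A re-registered handler rediscovered the device in time.
        self.offline_since.remove(instance);
    }

    // Instances whose grace period has expired and should be deleted now.
    fn expired(&self) -> Vec<String> {
        self.offline_since
            .iter()
            .filter(|(_, since)| since.elapsed() >= GRACE_PERIOD)
            .map(|(name, _)| name.clone())
            .collect()
    }
}
```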

@github-actions
Contributor

github-actions bot commented Jun 6, 2023

Issue has been automatically marked as stale due to inactivity for 90 days. Update the issue to remove label, otherwise it will be automatically closed.

@rpieczon

Any idea when this one will be fixed?

@github-actions
Contributor

Issue has been automatically marked as stale due to inactivity for 90 days. Update the issue to remove label, otherwise it will be automatically closed.

Labels: bug (Something isn't working), keep-alive
Project status: Investigating
Development: no branches or pull requests
4 participants