Conversation

haseebsyed12
Contributor

@haseebsyed12 haseebsyed12 commented Sep 22, 2025

When service account details are created in Vault (PasswordSafe), a Kubernetes
secret is generated. Argo Events then triggers a Job that runs an Ansible
playbook to ensure the user is created in Nautobot and a corresponding token
is provisioned.
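
Because the Job may run more than once for the same secret, the "ensure" step has to be idempotent. Below is a minimal sketch of that decision logic only. The function name `plan_provisioning` and the shape of the user and token records are assumptions made for illustration; they are not taken from the playbook in this PR.

```python
def plan_provisioning(username, existing_users, existing_tokens):
    """Return the actions needed so that `username` exists and owns a token.

    `existing_users` is a list of usernames and `existing_tokens` is a list
    of dicts with a "username" key. Both shapes are illustrative assumptions.
    """
    actions = []
    if username not in existing_users:
        actions.append("create_user")
    # Only request a new token when the user does not already own one.
    has_token = any(t.get("username") == username for t in existing_tokens)
    if not has_token:
        actions.append("create_token")
    return actions
```

Running it a second time against the updated state returns an empty list, which is the property the Job relies on when the same secret event fires again.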

@haseebsyed12 haseebsyed12 changed the title nautobot token mgmt feat: nautobot token mgmt Sep 22, 2025
@haseebsyed12 haseebsyed12 force-pushed the global-secrets branch 2 times, most recently from 3949238 to dcb83db Compare September 22, 2025 13:03
@haseebsyed12 haseebsyed12 requested a review from a team September 22, 2025 13:35
@haseebsyed12 haseebsyed12 force-pushed the global-secrets branch 4 times, most recently from 8beb45f to 8fa7e24 Compare September 22, 2025 19:16
@haseebsyed12 haseebsyed12 changed the title feat: nautobot token mgmt feat: Automate Nautobot service account provisioning Sep 22, 2025
@haseebsyed12 haseebsyed12 requested a review from cardoe September 22, 2025 19:34
@haseebsyed12 haseebsyed12 marked this pull request as ready for review September 22, 2025 19:40
@haseebsyed12 haseebsyed12 requested a review from a team September 22, 2025 20:05
cardoe
cardoe previously requested changes Sep 23, 2025
@haseebsyed12 haseebsyed12 force-pushed the global-secrets branch 2 times, most recently from 50f6815 to 65edba4 Compare September 24, 2025 01:28
@haseebsyed12 haseebsyed12 requested a review from cardoe September 24, 2025 01:45
@abhimanyu003
Contributor

This is looking good, but we need to document the flow with a flow chart, and we need a table that lists all the required passwords.

@haseebsyed12 haseebsyed12 force-pushed the global-secrets branch 7 times, most recently from 54871b8 to 1beab28 Compare September 25, 2025 14:42
Collaborator

@skrobul skrobul left a comment

lgtm, please rebase

return None, f"Failed to fetch tokens: {e}"

data = response.json()
tokens = data.get("results", [])
Contributor

Under what circumstances does the Nautobot API produce 2xx responses with no "results" key in the body? Is it correct to interpret this as "query was successful and there are no tokens", or should this be treated as an error, indicating that the query could not be completed and/or the results could not be parsed?

Contributor Author

For no tokens, the response is:

{
    "count": 0,
    "next": null,
    "previous": null,
    "results": []
}

Contributor

In that example the "results" key is present, with a value of the empty list. I was saying that if there is no "results" key at all in the dict, you treat this as the same thing, and I was wondering if that could mask errors from the API. However we are not worrying about errors, so you can ignore this.
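
If the distinction ever becomes worth enforcing, a stricter parse could look like the sketch below. It keeps the `(value, error)` return convention visible in the reviewed snippet; the helper name `extract_tokens` and the error messages are hypothetical and are not part of this PR.

```python
def extract_tokens(data):
    """Return (tokens, error) from a decoded list response.

    An empty "results" list means "no tokens". A missing or non-list
    "results" value is reported as an error instead of being read as empty.
    """
    if not isinstance(data, dict) or "results" not in data:
        return None, 'Unexpected response: no "results" key'
    results = data["results"]
    if not isinstance(results, list):
        return None, 'Unexpected response: "results" is not a list'
    return results, None
```

With the empty-list body shown earlier in this thread, this returns `([], None)`; a body without the key returns `None` plus a message, so the caller can tell the two cases apart.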

@cardoe
Contributor

cardoe commented Sep 29, 2025

So I'm still struggling to see the deployment configuration here. The questions I've got:

  • You'll either need to provide the list of service accounts / tokens that need to be defined in the environment that runs Nautobot (where an admin credential to create that will exist) OR if the individual site environments are defining their own then they'll need the admin credential. It isn't very clear to me from the code which path you intend to take.
  • You define a site-specific Application deployment of a kustomization.yaml into a site namespace nautobot, but to what purpose? The sites need the token credentials in the namespaces where their applications run.

Contributor

@stevekeay stevekeay left a comment

Like Doug says, if this is going to run in each site, the credential in passwordsafe would list the "central" nautobot URL, and every site will try to create the user and token in the central nautobot.

Should each site have its own username, like argo.iad3, argo.rxdb-lab, and so on?

@haseebsyed12
Contributor Author

> Like Doug says, if this is going to run in each site, the credential in passwordsafe would list the "central" nautobot URL, and every site will try to create the user and token in the central nautobot.
>
> Should each site have its own username, like argo.iad3, argo.rxdb-lab, and so on?

Yes, every site has its own set of secrets.

@haseebsyed12
Contributor Author

haseebsyed12 commented Sep 30, 2025

> So I'm still struggling to see the deployment configuration here. The questions I've got:
>
> * You'll either need to provide the list of service accounts / tokens that need to be defined in the environment that runs Nautobot (where an admin credential to create that will exist) OR if the individual site environments are defining their own then they'll need the admin credential. It isn't very clear to me from the code which path you intend to take.
> * You define a site-specific Application deployment of a `kustomization.yaml` into a site namespace `nautobot`, but to what purpose? The sites need the token credentials in the namespaces where their applications run.

  • We define the superuser username/password and token for each site in PasswordSafe.
  • The global cluster is responsible for creating the superuser tokens of the site clusters.
  • The global cluster (staging) creates the site cluster's (rxdb-lab) superuser secret and also creates that user and token in Nautobot.
  • The site cluster only creates the superuser secret; at this step it does not do anything in Nautobot.
  • Then, using the site's superuser token, the site creates all of its other users in Nautobot.
  • The Nautobot store provides application-specific tokens to the respective applications in their namespaces.
  • The global cluster's superuser token is not used anywhere in the site cluster.
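
The delegation chain in the list above can be illustrated with a toy in-memory model. This is only a sketch of who uses which token: `FakeNautobot`, the token strings, and the usernames `rxdb-lab-admin` and `argo.rxdb-lab` are all hypothetical, and nothing here calls the real Nautobot API.

```python
class FakeNautobot:
    """In-memory stand-in for Nautobot, used only to show token delegation."""

    def __init__(self, admin_token):
        # Maps token -> (username, is_superuser).
        self.tokens = {admin_token: ("global-admin", True)}

    def create_user(self, auth_token, username, new_token, superuser=False):
        owner = self.tokens.get(auth_token)
        if owner is None or not owner[1]:
            raise PermissionError("caller needs a superuser token")
        self.tokens[new_token] = (username, superuser)


nautobot = FakeNautobot(admin_token="global-secret")

# Global cluster: creates the site's superuser and its token.
nautobot.create_user("global-secret", "rxdb-lab-admin", "site-secret", superuser=True)

# Site cluster: uses only its own superuser token to create application users.
nautobot.create_user("site-secret", "argo.rxdb-lab", "argo-secret")
```

In this model the global token is needed exactly once per site, and an application token such as `argo-secret` cannot be used to create further users.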

Contributor

@stevekeay stevekeay left a comment

Let's take one example: the secret that a site's argo-workflows is going to use, containing a token allowing it to log in to Nautobot. In which namespace will that be created?

How was that secret previously created? If it is managed by Argo CD, is there an old manifest that we need to delete?

@haseebsyed12
Contributor Author

> Let's take one example: the secret that a site's argo-workflows is going to use, containing a token allowing it to log in to Nautobot. In which namespace will that be created?

Answer: The secret will be created in the nautobot namespace and then copied to the argo-events workflow using the nautobot ClusterSecretStore.

> How was that secret previously created? If it is managed by Argo CD, is there an old manifest that we need to delete?

Answer: Previously we used only the Nautobot superuser token everywhere. I also updated the manifest to use the new workflow-specific token instead of the superuser token.

@haseebsyed12 haseebsyed12 added this pull request to the merge queue Oct 7, 2025
Merged via the queue into main with commit 7014a25 Oct 7, 2025
30 checks passed
@haseebsyed12 haseebsyed12 deleted the global-secrets branch October 7, 2025 12:01