-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support remote clusters using API keys #8089
Conversation
Please do not spend some time on the case where node transport certificates are provided by a third-party tool (https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-transport-settings.html#k8s-transport-third-party-tools). This needs some adjustments, I'm working on it. |
// cluster is going to try to connect to the remote cluster service using the Service and each specific Pod IP. | ||
cfg[esv1.RemoteClusterPublishHost] = "${" + EnvRemoteClusterService + "}.${" + EnvNamespace + "}.svc" | ||
cfg[esv1.RemoteClusterBindHost] = "0.0.0.0" | ||
cfg[esv1.RemoteClusterPublishHost] = "${" + EnvPodName + "}.${" + HeadlessServiceName + "}.${" + EnvNamespace + "}.svc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default the published host is the Pod's IP address. While that IP address is automatically added in the ECK managed transport certificate it is not possible to include it when using the cert-manager (cert-manager/csi-driver#17).
That's why I decided to use the Pod hostname as available through the existing headless Service (so it can be resolved by other Pods).
It still has the downside that when using cert-manager CSI driver, the csi.cert-manager.io/dns-names
is now a bit involved, something along the lines of:
- name: transport-certs
csi:
driver: csi.cert-manager.io
readOnly: true
volumeAttributes:
csi.cert-manager.io/issuer-name: ca-cluster-issuer
csi.cert-manager.io/issuer-kind: ClusterIssuer
csi.cert-manager.io/dns-names: "${POD_NAME}.${POD_NAMESPACE}.svc.cluster.local,${POD_NAME}.<cluster-name>-es-<nodeset-name>.${POD_NAMESPACE}.svc,<cluster-name>-es-remote-cluster.${POD_NAMESPACE}.svc"
${POD_NAME}.${POD_NAMESPACE}.svc.cluster.local
is the existing, recommended DNS name, from our documentation (I guess it only works because ofverification_mode: certificate
in the transport configuration)${POD_NAME}.<cluster-name>-es-<nodeset-name>.${POD_NAMESPACE}.svc
is to match the published host.<cluster-name>-es-remote-cluster.${POD_NAMESPACE}.svc
is to match the remote cluster service.
(I think an alternative would be to try to use verification_mode: certificate
for the remote cluster server, this is something I wanted to avoid)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine
buildkite test this -f p=gke,t=TestRemoteClusterWithAPIKeys -m s=8.9.2,s=8.15.2,s=8.16.0-SNAPSHOT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a bit of time today testing your PR and it is really impressive how well it worked. 🚀 I am not sure if an expectation mechanism on the secrets is need IIUC the worst case is we re-create an API key?
What I found slightly difficult to reason about is the reconciliation in the remote cluster controller. I know that this is to large parts existing code into which the new functionality was inserted. But maybe we can do something with the naming. IIUC the local cluster is always the remote server and I think this threw me off because in my head local meant the client that makes calls to a remote using an API key.
// cluster is going to try to connect to the remote cluster service using the Service and each specific Pod IP. | ||
cfg[esv1.RemoteClusterPublishHost] = "${" + EnvRemoteClusterService + "}.${" + EnvNamespace + "}.svc" | ||
cfg[esv1.RemoteClusterBindHost] = "0.0.0.0" | ||
cfg[esv1.RemoteClusterPublishHost] = "${" + EnvPodName + "}.${" + HeadlessServiceName + "}.${" + EnvNamespace + "}.svc" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine
} | ||
|
||
// Save the generated keys in the keystore. | ||
if err := clientClusterAPIKeyStore.Save(ctx, c, clientES); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly we reconcile from the server's perspective which means if a cluster is client to multiple remotes multiple threads of reconciliation will try to update the same secret? Potentially at the same time depending on the parallelism the controller is configured with? Do you think this will be problem in practice? Did you consider doing it the other way round?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider doing it the other way round?
Yes, since we need an Elasticsearch client to reconcile the API keys it seemed more "natural" to me to create the ES client "once", and then reconcile/propagates the API keys for all the client clusters in the same loop, when the remote cluster is reconciled (as opposed to create n Elasticsearch clients for each of the remote clusters to reconcile the API keys). My feeling was that, in case of a conflict on the Secret
, the consequence would be that we would "just" need to create a new API key during the next iteration, and try to store it again, which seemed acceptable.
In any case one thing we may want to solve is when the API key is not immediately observed in the Secret (because of the client's cache). This would require a kind of "expectation mechanism": the generated keys would be stored in memory until they are persisted. This would prevent the conflict side effect when the Secret
is created/updated since we would no longer need to generate a new key because we still have it in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. It is probably computationally more expensive to create many ES clients with certificate pools than the current approach which minimises the number of clients created at the cost of potential conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One downside of the current approach that occurred to me during testing is that the reconciliation that optimises for creating fewer clients for the remote server clusters is more disruptive to the client clusters as every new remote cluster connection requires a full restart of the cluster to update the keystores. I don't think this a problem in practice, it I can only think of very rare scenarios where this might be observable (adding remote clusters within very short intervals but not at once)
I removed the draft status. I still want to work on the documentation and add some additional unit tests, but happy to get additional feedback in the meantime. |
// Check that the API is available | ||
esClient = newEsClient | ||
// Get all the API Keys, for that specific client, on the reconciled cluster. | ||
getCrossClusterAPIKeys, err := esClient.GetCrossClusterAPIKeys(ctx, "eck-*") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we try to handle the case where a new cluster is not reachable yet to avoid noisy error logs like this:
manager.eck-operator Reconciler error {"service.version": "2.15.0-SNAPSHOT+65769109", "controller": "remotecluster-controller", "object": {"name":"cluster3","namespace":"ns2"}, "namespace": "ns2", "name": "cluster3", "reconcileID": "8aebea3f-118d-474e-b4c8-4b55efc26141", "error": "elasticsearch client failed for https://cluster3-es-default-0.cluster3-es-default.ns2:9200/_security/api_key?active_only=true&error_trace=true&name=eck-%2A: Get \"https://cluster3-es-default-0.cluster3-es-default.ns2:9200/_security/api_key?active_only=true&error_trace=true&name=eck-%2A\": EOF"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
/Users/pebrc/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
/Users/pebrc/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the case:
cloud-on-k8s/pkg/controller/remotecluster/controller.go
Lines 181 to 186 in 6576910
// Check if the ES API is available. We need it to create, update and invalidate | |
// API keys in this cluster. | |
if !services.NewElasticsearchURLProvider(*localEs, r.Client).HasEndpoints() { | |
log.Info("Elasticsearch API is not available yet") | |
return results.WithResult(defaultRequeue).Aggregate() | |
} |
Maybe there's a bug, this EOF
seems a bit odd though 🤔 , I would expect a connection refused or another error, EOF
feels like the connection was initiated but unexpectedly closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check if this error comes from our port-forwarder "hack" used in dev mode please, maybe by looking at the full stacktrace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that was it
}, | ||
Data: data, | ||
} | ||
if _, err := reconciler.ReconcileSecret(ctx, c, expected, owner); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was seeing a few noisy conflicts during testing I assume because Save
is called in two places during one reconciliation?
2024-10-20T10:29:43.425+0200 ERROR manager.eck-operator Reconciler error {"service.version": "2.15.0-SNAPSHOT+65769109", "controller": "remotecluster-controller", "object": {"name":"cluster1","namespace":"ns1"}, "namespace": "ns1", "name": "cluster1", "reconcileID": "316f64b0-454e-4285-8e89-34c18f0da6a2", "error": "Operation cannot be fulfilled on secrets \"cluster1-es-remote-api-keys\": the object has been modified; please apply your changes to the latest version and try again", "errorCauses": [{"error": "Operation cannot be fulfilled on secrets \"cluster1-es-remote-api-keys\": the object has been modified; please apply your changes to the latest version and try again"}]}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
/Users/pebrc/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
/Users/pebrc/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
/Users/pebrc/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:224
} | ||
|
||
// Save the generated keys in the keystore. | ||
if err := clientClusterAPIKeyStore.Save(ctx, c, clientES); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One downside of the current approach that occurred to me during testing is that the reconciliation that optimises for creating fewer clients for the remote server clusters is more disruptive to the client clusters as every new remote cluster connection requires a full restart of the cluster to update the keystores. I don't think this a problem in practice, it I can only think of very rare scenarios where this might be observable (adding remote clusters within very short intervals but not at once)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! 🚢
Small aside: I noticed we don't watch rolebindings in the ES namespaces so when using the RBAC association restrictions we don't re-reconcile if those change. Not sure if there is an easy fix though because users could also use cluster role bindings. Also it has been like this for a very long time and was not introduced in your PR.
return response, err | ||
} | ||
|
||
func (c *clientV8) InvalidateCrossClusterAPIKey(ctx context.Context, name string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why this is not just called DeleteCrossClusterAPIKey
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the key is actually not deleted:
This API invalidates API keys created by the create API key or grant API key APIs. Invalidated API keys fail authentication, but they can still be viewed using the get API key information and query API key information APIs, for at least the configured retention period, until they are automatically deleted.
https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-invalidate-api-key.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL
buildkite test this -f p=gke,t=TestRemoteClusterWithAPIKeys -m s=8.9.2,s=8.15.2,s=8.16.0-SNAPSHOT,s=8.17.0-SNAPSHOT |
Sorry, just realized I forgot to reply to your comment. We are actually reconciling every 15 minutes in the current implementation:
// RequeueRbacCheck returns a reconcile result depending on the implementation of the AccessReviewer.
// It is mostly used when using the subjectAccessReviewer implementation in which case a next reconcile loop should be
// triggered later to keep the association in sync with the RBAC roles and bindings.
// See https://github.com/elastic/cloud-on-k8s/issues/2468#issuecomment-579157063
func RequeueRbacCheck(accessReviewer rbac.AccessReviewer) reconcile.Result {
switch accessReviewer.(type) {
case *rbac.SubjectAccessReviewer:
return reconcile.Result{RequeueAfter: 15 * time.Minute}
default:
return reconcile.Result{}
}
} |
I had completely forgotton about that. I did not wait long enough during testing obviously. |
This PRs allows to define API keys for remote cluster connections.
I'm creating this PR as a draft to get a first feedback on the overall architecture and how the API keys are managed.
The examples in this comment refer to the recipe available in
config/recipes/remoteclusters/elasticsearch.yaml
.No new controller,
➡️remoteca
remotecluster-controller
The API key reconciliation has been included in the existing
remoteca
controller that was taking care of the certificate management.It has been therefore renamed
remotecluster-controller
and is now also handling API keys reconciliation in both:API Key management
In the remote Elasticsearch cluster
API keys created in the remote cluster by the operator follow the following naming convention:
eck-<client_cluster_ns>-<client_cluster_name>-<remote_cluster_name>
.Some metadata, like the client cluster name or the hash of the expected key specification, are attached to the API keys in order to facilitate their conciliation:
In the client Elasticsearch cluster
Encoded keys used by the client cluster to authenticate against the remote one are stored in a new dedicated
Secret
on the client cluster (thisSecret
is handled using theAPIKeyStore
inpkg/controller/remotecluster/keystore.go
):> kubectl get secret -l common.k8s.elastic.co/type=remote-cluster-api-keys -A NAMESPACE NAME TYPE DATA AGE ns1 cluster1-es-remote-api-keys Opaque 1 13m
We are tracking the API key IDs in an annotation, this is used to detect when a key has been externally invalidated:
This Secret is:
Is it possible to go back to the "legacy" remote cluster mode once API keys are enabled?
Yes, you just have to delete the api key from the custom resource. However, this may result in downtime.
Todo
There are still a few things I'm not happy with. For example I'm wondering if we need an expectation mechanism for the Secrets that hold the encoded key: if a key is created and the Secret is created but not observed in the next reconciliation we may invalidate the key that has just been created.
Other things to complete before merging or going ga:
To be discussed in dedicated issues/prs
Testing
As previously mentioned there is an example in
config/recipes/remoteclusters/elasticsearch.yaml
.Here are a few API calls that you may find useful while testing this PR:
GET /_remote/info
to get the current remote cluster status. Note that theconnected
field may not be immedately refreshed/updated)GET /_security/api_key?active_only=true&name=eck-*
to get the active keys created by the operator.GET /to-ns2-cluster2:kibana_sample_data_ecommerce/_search
, when using the manifest in the recipe and run onns1/cluster1
this should return the data fromns2/cluster2
Fixes #7818