@tehbooom
Member

@tehbooom tehbooom commented Aug 20, 2025

Resolves #8925

Adding Elastic Package Registry (EPR) to ECK has been a frequently requested feature.

EPR does not reference any other resource, since it requires neither a license nor any other application.

The following was implemented for EPR:

  • Defaults to TLS
  • Sets the default container image to docker.elastic.co/package-registry/distribution
  • Users can set their own images
  • Users can update the config following the reference
  • Kibana can reference EPR, as it does Elasticsearch and Enterprise Search
  • If Kibana references EPR and TLS is enabled, the controller populates xpack.fleet.registryUrl and sets the environment variable NODE_EXTRA_CA_CERTS to the path of the mounted EPR CA
  • If a user provides their own NODE_EXTRA_CA_CERTS with a mount, the controller combines the certificates, appending the EPR CA to the user-specified CA
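
To make the last bullet concrete, here is a minimal sketch of appending one PEM bundle to another. It is not the controller code from this PR: the function name combineCABundles and the sample certificates are hypothetical, and the only validation is that each input contains at least one PEM block.

```go
package main

import (
	"bytes"
	"encoding/pem"
	"errors"
	"fmt"
)

// hasPEMBlock reports whether b contains at least one decodable PEM block.
func hasPEMBlock(b []byte) bool {
	block, _ := pem.Decode(b)
	return block != nil
}

// combineCABundles (hypothetical name) returns the user CA bundle followed by
// the EPR CA. Trailing newlines are normalized so that the END line of one
// certificate never runs into the BEGIN line of the next.
func combineCABundles(userCA, eprCA []byte) ([]byte, error) {
	if !hasPEMBlock(userCA) {
		return nil, errors.New("user CA bundle contains no PEM block")
	}
	if !hasPEMBlock(eprCA) {
		return nil, errors.New("EPR CA contains no PEM block")
	}
	var out bytes.Buffer
	out.Write(bytes.TrimRight(userCA, "\n"))
	out.WriteByte('\n')
	out.Write(bytes.TrimRight(eprCA, "\n"))
	out.WriteByte('\n')
	return out.Bytes(), nil
}

func main() {
	// Placeholder PEM blocks, not real certificates.
	user := []byte("-----BEGIN CERTIFICATE-----\nAAAA\n-----END CERTIFICATE-----")
	epr := []byte("-----BEGIN CERTIFICATE-----\nBBBB\n-----END CERTIFICATE-----\n")

	combined, err := combineCABundles(user, epr)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(combined))
}
```

The combined bundle would then be written to a single file, and NODE_EXTRA_CA_CERTS pointed at that file, since the variable takes one path.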

This was tested with and without setting NODE_EXTRA_CA_CERTS, using the manifest below:

apiVersion: epr.k8s.elastic.co/v1alpha1
kind: ElasticPackageRegistry
metadata:
  name: registry
spec:
  version: 9.1.2
  count: 1
  podTemplate:
    spec:
      containers:
      - name: package-registry
        image: docker.elastic.co/package-registry/distribution:lite-9.1.2
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 9.1.2
  nodeSets:
  - name: default
    count: 1
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 9.1.2
  count: 1
  elasticsearchRef:
    name: elasticsearch
  packageRegistryRef:
    name: registry
  config:
    telemetry.optIn: false
    xpack.fleet.isAirGapped: true
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.default.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.default.svc:8220"]
    xpack.fleet.packages:
      - name: system
        version: latest
      - name: elastic_agent
        version: latest
      - name: fleet_server
        version: latest
    xpack.fleet.agentPolicies:
      - name: Fleet Server on ECK policy
        id: eck-fleet-server
        namespace: default
        monitoring_enabled:
          - logs
          - metrics
        unenroll_timeout: 900
        package_policies:
        - name: fleet_server-1
          id: fleet_server-1
          package:
            name: fleet_server
  podTemplate:
    spec:
      containers:
      - name: kibana
        env:
        - name: NODE_EXTRA_CA_CERTS
          value: /custom/user/ca-bundle.crt
        volumeMounts:
        - name: custom-ca
          mountPath: /custom/user
          readOnly: true
      volumes:
      - name: custom-ca
        secret:
          secretName: user-custom-ca-secret
---
apiVersion: v1
kind: Secret
metadata:
  name: user-custom-ca-secret
  namespace: default
type: Opaque
data:
  ca-bundle.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZtVENDQTRHZ0F3SUJBZ0lVYjVrK2d6V3A5YjljWTV4bkhUcWZNdHFHUXIwd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1hERUxNQWtHQTFVRUJoTUNXRmd4RlRBVEJnTlZCQWNNREVSbFptRjFiSFFnUTJsMGVURWNNQm9HQTFVRQpDZ3dUUkdWbVlYVnNkQ0JEYjIxd1lXNTVJRXgwWkRFWU1CWUdBMVVFQXd3UGRHVnpkQzVsYkdGemRHbGpMbU52Ck1CNFhEVEkxTURneU1ERTRNakl3T0ZvWERUTTFNRGd4T0RFNE1qSXdPRm93WERFTE1Ba0dBMVVFQmhNQ1dGZ3gKRlRBVEJnTlZCQWNNREVSbFptRjFiSFFnUTJsMGVURWNNQm9HQTFVRUNnd1RSR1ZtWVhWc2RDQkRiMjF3WVc1NQpJRXgwWkRFWU1CWUdBMVVFQXd3UGRHVnpkQzVsYkdGemRHbGpMbU52TUlJQ0lqQU5CZ2txaGtpRzl3MEJBUUVGCkFBT0NBZzhBTUlJQ0NnS0NBZ0VBMHljTGVySWR3LzdpbGlKMzVBUEZ4bUx6TFRnNWRhUStWSUttS2lNbStlTTYKanJOY3lnbGphNVFEbHYvMStGUm5hamhrRTBobHoycXEzTjk0U1pYN3M2eHBnQUVzMGVQQ3VaZVBNU2VUYlYyRgp0YlIxNnFuM0JjenVxN3laOXZwdHR3MmJRdkJkY3JzZFU4T2RYUWhGNFd4QUFwODRKYWlMNmkzMlA2K2VPODBwCmh3Z1kwS0F1bzZoZC8zaFpNME14M2MwRmJmU0JHaTUyOHZKODYzUDRXZlEwMWdtUUxVbGl0UlhhTUhiaDRXSm0KOU45c0psUXpnbkNuQjZ6YkZjZ2gweWxrakd0UzBIZEo3eSs3dmE0Q1BqdkxlWGpwTnZuQzRjTmlocnp4Wmw5bQphM0ZVdVpiU0lRekE2ZFlkdkdrT2V3OTJEek1BaTdldU14UDdyYVhRejZmc1N6U1V4N1RjQWl5M2E5VU9Fdi9rCk5NV3VTbDlUMHRRSkhJSzJMc0t0MlVKWVVHWk4wOWU2SUVSTlJOL0FIUjVDbTlhcVQ1Q2ZyQW9JVVhNdUg2S1oKN1JCZFFockRxL2xEQk54bWs5dW44V2lic0NSVnkvVXRJQ3lOSytxbGpGUWZEd01hNkRkd3BjcnpnTWZnU3RTawpLek1LRUJla2N0Q0Q4dHNmTjZYem5USmNBYUJETzFlQWZyT0Z2NG1PTXJqVG90OEYvK3pxN0dXNTlqWTRvdFhMCkY3TnpadFl0eWsvbDRvb2hUZUFuM1ptd1BDMGJFQ1FkTmpTVkZ6ZXJCamE4ZjhacGpKRzNjUllyVmh6YUNsRWMKRU5wbFRHcldVaUVwRDdnTnNlNWNDSnZpQU12NHdwait2QTVVNlA3Z0MxUUtKV2hWS3BVYWcvTmtTSUFCRmtrQwpBd0VBQWFOVE1GRXdIUVlEVlIwT0JCWUVGTWdldEVJajZtRWdsZURGNkVNdUY4NXVnYzdZTUI4R0ExVWRJd1FZCk1CYUFGTWdldEVJajZtRWdsZURGNkVNdUY4NXVnYzdZTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3RFFZSktvWkkKaHZjTkFRRUxCUUFEZ2dJQkFEOFU3dm1yWmhHTUZiV2YzRDZlNy84TUwzWEhLRk5TNy9UeWF3U2tvdGVSTVdFbgp1RWhQK2dmbkdUT2ZITFlQeHl5eEJ4U041T29sZHRJclo5dnhBc2dlYWJzSkJaenhQVHpxU09VN3h3b09LcTlRCmdKRUYxL0ZmemFlR1V5dVE2S1ZaZ0QvZ1JPSW42Ri9OUGlzM1pvbUpPOStuVWdTTnNiUm9RYmdPUGdPV3Q3Z1gKVEhuOHJpdUp2OXRPNFBRN09Sa3pubDJYbERlcE9
xNVpwSUtkcVl0Rm5MUjF3SllyREZESmt0Q3h6MzFob0FrZwpSVjlSU1BSMFFxZ1JQeFNpNGpXdkNGUk5XTUFJc0NadGJsWExRRUljWGI1YnlsWXV2a3psTTJ4dHlHK3FaRFhMCnFoZDVNeFZIUkpqTzE1VEdpZXFRcUpMVkZyVElhTHFoaXZpQ1pUbDJoVkYxVlpPVG05MU5aeE53M25RL3JyeDgKK2VQV2xTWlZKWXc3SDRkWkx5WTFjRUxLT0YrZDJybVNSZ2pWaHZycUZ3R1M3MUQzYkV4Y0dSakNrOHNQWEZyRwpsOFRzY05RMXBPSGVuNlJhOFhVdGtxU1doZllFb3owZjBEem4wYmt4c2VWaCttS1BHV3QxcHdlemVFTFVwaHE3CmwwSVRLeis1b1lqYWVHTDRia25kcWlpemwzWkc2N0lYL3VyR0dQVUxkLzU1NEtRMFFPMS92S3Y2dE1YMWc0dVMKWHdWc0pzQjlrTUIwRFFxbDhRYmg0UEJ2ZW9RRTZvL3BycXRtWjR1RWdDMCt1cm5paDlCY1FweFNKOUljR1kxTQpBQzRBcG5Pem1CYTFhUVBMcDRaRFIxQXpFK1hXWDd2WWNWYUxleUJxRzRja3dwbUtOUnhpcnJjS2NaMkYKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server
spec:
  version: 9.1.2
  kibanaRef:
    name: kibana
  elasticsearchRefs:
  - name: elasticsearch
  mode: fleet
  fleetServerEnabled: true
  policyID: eck-fleet-server
  deployment:
    replicas: 1
    podTemplate:
      spec:
        serviceAccountName: fleet-server
        automountServiceAccountToken: true
        resources:
          requests:
            cpu: 200m
            memory: 1Gi
          limits:
            cpu: 1
            memory: 2Gi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fleet-server
  namespace: default
rules:
- apiGroups: [""]
  resources:
  - pods
  - namespaces
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs:
    - get
    - watch
    - list
- apiGroups: ["batch"]
  resources:
    - jobs
  verbs:
    - get
    - watch
    - list
- apiGroups: ["coordination.k8s.io"]
  resources:
  - leases
  verbs:
  - get
  - create
  - update
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fleet-server
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fleet-server
  namespace: default
subjects:
- kind: ServiceAccount
  name: fleet-server
  namespace: default
roleRef:
  kind: ClusterRole
  name: fleet-server
  apiGroup: rbac.authorization.k8s.io

@prodsecmachine
Collaborator

prodsecmachine commented Aug 20, 2025

Snyk checks have passed. No issues have been found so far.

Status  Scanner               Critical  High  Medium  Low  Total (0)
        Open Source Security  0         0     0       0    0 issues
        Licenses              0         0     0       0    0 issues


@github-actions

github-actions bot commented Aug 20, 2025

🔍 Preview links for changed docs

Member

@jsoriano jsoriano left a comment

Took a quick look from the side of the team maintaining Package Registry.

It looks great. Thanks for adding support for Package Registry in ECK; this will help many users.

I added some comments. Please let us know if you need a more in-depth review from our side.

@pebrc pebrc requested a review from Copilot August 25, 2025 15:17

This comment was marked as outdated.

@naemono naemono requested a review from Copilot August 25, 2025 18:39
@naemono naemono added >enhancement Enhancement of existing functionality discuss We need to figure this out labels Aug 25, 2025
@botelastic botelastic bot removed the triage label Aug 25, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds Elastic Package Registry (EPR) support to ECK, introducing a new CRD for deploying EPR instances and enabling Kibana to reference EPR instances for Fleet package management.

  • Adds ElasticPackageRegistry CRD with controller to manage EPR deployments
  • Enables Kibana to associate with EPR instances via packageRegistryRef field
  • Implements TLS certificate handling and CA mounting for secure communication between Kibana and EPR
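
As a rough illustration of the association summarized above, the sketch below derives the Kibana setting and environment variable from an EPR endpoint. The registryAssociation type, the kibanaSettings function, the service URL, and the CA mount path are all assumptions made for the example and are not taken from the PR's code.

```go
package main

import "fmt"

// registryAssociation is a hypothetical, simplified view of what Kibana needs
// to know about an associated EPR instance.
type registryAssociation struct {
	URL        string // EPR endpoint as seen from Kibana
	CACertPath string // path of the mounted EPR CA; empty when TLS is disabled
}

// kibanaSettings returns the Kibana config entries and container environment
// variables implied by the association.
func kibanaSettings(a registryAssociation) (config, env map[string]string) {
	config = map[string]string{"xpack.fleet.registryUrl": a.URL}
	env = map[string]string{}
	if a.CACertPath != "" {
		// Node.js reads extra trusted CAs from the file named by this variable.
		env["NODE_EXTRA_CA_CERTS"] = a.CACertPath
	}
	return config, env
}

func main() {
	config, env := kibanaSettings(registryAssociation{
		URL:        "https://registry-epr-http.default.svc:8080", // assumed service name and port
		CACertPath: "/mnt/epr-ca/ca.crt",                         // assumed mount path
	})
	fmt.Println(config)
	fmt.Println(env)
}
```

When TLS is disabled the CA path is empty, so only xpack.fleet.registryUrl is set and the environment is left untouched.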

Reviewed Changes

Copilot reviewed 60 out of 61 changed files in this pull request and generated 3 comments.

File                                                 Description
pkg/apis/epr/v1alpha1/                               New API definitions for ElasticPackageRegistry CRD
pkg/controller/packageregistry/                      Controller implementation for managing EPR resources
pkg/controller/association/controller/kibana_epr.go  Association controller for Kibana-EPR relationships
pkg/apis/kibana/v1/kibana_types.go                   Adds packageRegistryRef field and EPR association support
pkg/controller/kibana/                               Updates Kibana controller to handle EPR associations and CA certificates
test/e2e/                                            E2E tests for EPR functionality and associations
Comments suppressed due to low confidence (1)

pkg/controller/kibana/pod_test.go:1

  • The comment on line 67 says 'readinessProbe is the readiness probe for the maps container' but this function is in the packageregistry controller and should refer to the package registry container.
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one


@barkbay barkbay added >feature Adds or discusses adding a feature to the product and removed >enhancement Enhancement of existing functionality labels Aug 26, 2025
@naemono

This comment was marked as resolved.

@naemono naemono self-assigned this Dec 1, 2025
@naemono
Contributor

naemono commented Dec 8, 2025

All of the issues with running in OpenShift/OCP-style clusters have been resolved and verified. I'm waiting to verify the UBI images specifically once they are built and pushed; after that, this should be close to a mergeable state.

The UBI images seem to run without issue:

                  openshift.io/scc: restricted-v2
                  packageregistry.k8s.elastic.co/config-hash: 2422330696
                  seccomp.security.alpha.kubernetes.io/pod: runtime/default
                  security.openshift.io/validated-scc-subject-type: user
Status:           Running
SeccompProfile:   RuntimeDefault
IP:               10.129.2.98
IPs:
  IP:           10.129.2.98
Controlled By:  ReplicaSet/registry-epr-858f669ff
Containers:
  package-registry:
    Container ID:    cri-o://a1e3ce5cf092d7b636a9d24b08ef6bd2d93e45685dcb3d01b4a6bf872a51db79
    Image:           docker.elastic.co/package-registry/distribution:lite-ubi

I think the one final change is to ensure that in an OCP environment we use the UBI images by default. This seems to differ from the standard stack images, which are UBI by default from 9.x forward. I'll make the changes and verify.

The suffix should handle this when --ubi-only is set. I believe this is how we normally handle this in other controllers.

https://github.com/elastic/cloud-on-k8s/pull/8800/files#diff-52e0749d4ea9659ff8934fe1491cc88fc5508988f026b1ca8a0704e3a75da924R107-R111
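
For readers unfamiliar with the suffix approach, here is a hedged sketch of the general idea, not the code behind the link above. The imageRef function and the "-ubi" constant are hypothetical, and so is the resulting lite-9.2.0-ubi tag: this thread only shows the lite-9.2.0 and lite-ubi tags, so whether versioned UBI tags are published is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// ubiSuffix is the assumed marker for UBI-based image tags.
const ubiSuffix = "-ubi"

// imageRef (hypothetical) builds an image reference and, when ubiOnly is set,
// appends the UBI suffix to the tag unless the tag already carries it.
func imageRef(repository, tag string, ubiOnly bool) string {
	if ubiOnly && !strings.HasSuffix(tag, ubiSuffix) {
		tag += ubiSuffix
	}
	return repository + ":" + tag
}

func main() {
	repo := "docker.elastic.co/package-registry/distribution"
	fmt.Println(imageRef(repo, "lite-9.2.0", false))
	fmt.Println(imageRef(repo, "lite-9.2.0", true)) // hypothetical tag
	fmt.Println(imageRef(repo, "lite-ubi", true))   // already UBI, left unchanged
}
```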

@naemono naemono requested review from barkbay and pebrc December 8, 2025 02:43
Collaborator

@pebrc pebrc left a comment

Did not manage to finish my review today. Wanted to leave some first feedback regardless. My first test on OCP GKE with the regular lite image failed, btw; see below.

Also, it would probably be good to resolve the comment threads that have been addressed. Otherwise it is really hard to review with 126 comments on this PR.

},
Privileged: ptr.To(false),
ReadOnlyRootFilesystem: ptr.To(true),
RunAsNonRoot: ptr.To(true),
Collaborator

Error: container has runAsNonRoot and image will run as root (pod: "registry-epr-95b44664b-8rx5p_default(8d9e317f-1201-4a6b-8087-81b2e7d5e3cb)", container: package-registry)

docker image inspect ...

 "Config": {
            "User": "0",

Contributor

Openshift 4.20:

❯ kc version
Client Version: v1.32.0
Kustomize Version: v5.5.0
Server Version: v1.33.5

❯ oc version
Client Version: 4.19.12
Kustomize Version: v5.5.0
Server Version: 4.20.1
Kubernetes Version: v1.33.5

❯ kc get pod -n elastic
NAME                            READY   STATUS    RESTARTS   AGE
elasticsearch-es-default-0      1/1     Running   0          5m32s
kibana-kb-688cc567dc-djs4g      1/1     Running   0          5m30s
registry-epr-798dcbb6d8-vvk5m   1/1     Running   0          5m35s

❯ kc get pod -n elastic registry-epr-798dcbb6d8-vvk5m -o yaml | yq '.spec.containers[].image'
docker.elastic.co/package-registry/distribution:lite-9.2.0

❯ docker images | grep package
docker.elastic.co/package-registry/distribution:lite-9.2.0                                    be2c7fb983b1       13.7GB             0B
docker.elastic.co/package-registry/distribution:lite-ubi                                      ec9a9bd7c594       14.1GB             0B

# so the non-UBI image is root:
❯ docker inspect be2c7fb983b1 | jq '.[].Config' | grep User
  "User": "0",

# the UBI image is not root:
❯ docker inspect ec9a9bd7c594 | jq '.[].Config' | grep User
(none)

Annotations on running pod:

    seccomp.security.alpha.kubernetes.io/pod: runtime/default
    security.openshift.io/validated-scc-subject-type: user

Security Context:
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsNonRoot: true
      runAsUser: 1000730000
      seccompProfile:
        type: RuntimeDefault

Why am I not running into the same issue as you, @pebrc? I'm clearly missing something.

Contributor

@pkoutsovasilis pkoutsovasilis Dec 11, 2025

Thinking out loud here: you are not running into the same issue because in OCP a random non-zero UID is set by default (runAsUser: 1000730000). In contrast, on plain k8s, where @pebrc and I hit this error, the container runs with the image's UID, which is 0, unless you specify runAsUser?
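
The reasoning above can be modelled in a few lines. This is a simplified sketch of the runAsNonRoot decision, not the kubelet's actual code: the checkRunAsNonRoot function is hypothetical and it only handles numeric image users, leaving images with a named or unset user out of scope.

```go
package main

import (
	"errors"
	"fmt"
)

// errRunsAsRoot mirrors the wording of the error quoted earlier in this thread.
var errRunsAsRoot = errors.New("container has runAsNonRoot and image will run as root")

// effectiveUID returns the UID the container would start with: an explicit
// runAsUser wins, otherwise the UID baked into the image is used.
func effectiveUID(runAsUser *int64, imageUID int64) int64 {
	if runAsUser != nil {
		return *runAsUser
	}
	return imageUID
}

// checkRunAsNonRoot (hypothetical) rejects a container that asks for
// runAsNonRoot but would end up running as UID 0.
func checkRunAsNonRoot(runAsNonRoot bool, runAsUser *int64, imageUID int64) error {
	if !runAsNonRoot {
		return nil
	}
	if effectiveUID(runAsUser, imageUID) == 0 {
		return errRunsAsRoot
	}
	return nil
}

func main() {
	// OpenShift case from this thread: the SCC injects a non-zero runAsUser.
	ocpUID := int64(1000730000)
	fmt.Println("OpenShift:", checkRunAsNonRoot(true, &ocpUID, 0))

	// Plain Kubernetes case: no runAsUser, image user is 0.
	fmt.Println("GKE:", checkRunAsNonRoot(true, nil, 0))
}
```

Under this model the non-UBI image (user 0) passes on OpenShift only because the platform overrides the UID, which matches the two observations in the thread.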

Contributor

OK, I replicated this in a GKE environment, not an OpenShift environment (I thought OpenShift was the more restrictive, hence my testing there to replicate this).

Collaborator

Apologies. I must have mixed up kubecontexts.

return "", false, err
}
eprVersionResponse := &eprVersionResponse{}
ver, isServerless, err := info.Version("/api/epr/v1/internal/version", eprVersionResponse)
Collaborator

How stable is that internal API? Is there no public API we can use?

Contributor

Some extra thoughts on that: AFAICT this path, /api/epr/v1/internal/version, doesn't exist in the Elastic Package Registry service, e.g. here. It reminds me of a Kibana service path, but even there I get a 404. Also, do we really support an external EPR for Kibana? AFAIK EPR doesn't need a username and password, and association.GetUnmanagedAssociationConnectionInfoFromSecret will look for these.

continue
}
if assoc.AssociationType() == commonv1.PackageRegistryAssociationType {
// EPR is version-agnostic, skip version compatibility check
Collaborator

Is that really true? Can we run any EPR with any Kibana version? I have doubts

Contributor

I am not sure whether this is indeed true, but since the docker.elastic.co/package-registry/distribution image publishes tags that follow the stack version, I would stick to how we handle versions of other stack components.

Signed-off-by: Michael Montgomery <[email protected]>

Labels

discuss We need to figure this out >feature Adds or discusses adding a feature to the product release-highlight Candidate for the ECK release highlight summary

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add package registry (EPR) to ECK

8 participants