Feature Request: Allow configuration of scanning reports output #102

Closed
nnelas opened this issue Aug 28, 2023 · 7 comments · Fixed by #165

Comments

@nnelas
Contributor

nnelas commented Aug 28, 2023

Is your feature request related to a problem?

Hallo! 👋

I'm currently evaluating the Kubewarden stack as the policy manager for a multi-tenant cluster, and I'm struggling a bit with audit-scanner, as by default it requires (Cluster)PolicyReport instances for its PolicyReportStore. In our experience with Gatekeeper on clusters with hundreds of namespaces, all these CRD instances increased the pressure on the apiserver and etcd nodes, which we would like to avoid if possible.

Solution you'd like

As a result, I would like to understand whether it would be possible to add a configuration option that defines where audit-scanner writes its scanning report. To keep compatibility, it could keep writing to a (Cluster)PolicyReport by default. It could also output JSON to stdout (just like the --print flag that you currently have) or even write to a file on a volumeMount.

Alternatives you've considered

Looking at the audit-scanner source code, I couldn't find any alternative that doesn't require changing it.

Anything else?

Please let me know if this makes sense and if it's aligned with the vision that you have for audit-scanner. I can also happily contribute with a PR, if this is something that you can't fit in your roadmap in the foreseeable future.

Thank you so much for all the hard work that you've put into this project! 🙌

@viccuad
Member

viccuad commented Aug 31, 2023

Hi, thanks for opening this feature request!

> Please let me know if this makes sense and if it's aligned with the vision that you have for audit-scanner.

We share the same concerns about instantiating PolicyReports WRT etcd and the apiserver in bigger clusters.

In fact, we have had this in mind since the inception of the audit-scanner feature. You can have a look at its RFCs here, particularly:
https://github.com/kubewarden/rfc/blob/main/rfc/0012-policy-report.md#drawbacks
https://github.com/kubewarden/rfc/blob/main/rfc/0011-audit-checks.md#drawbacks

So that is good news :).

For developing and testing, we provide the PolicyReport as JSON in the logs with --print.
We already use a structured logging library that outputs JSON. The problem is enforcing that everything audit-scanner writes to stdout and stderr goes through it. A single message printed outside the logger would not be JSON, so the log stream as a whole would no longer be a sequence of JSON messages, and an end user would have to parse the log contents defensively. So far we have taken care that this is not the case, but if we want stdout to be consumed programmatically, we should guarantee its format, possibly with a Go linter.
This is why I'm a bit hesitant on exposing that --print flag as end-user consumable in the helm-charts, but I may be overthinking this.
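To make the concern concrete, here is a minimal Python sketch of what a defensive consumer of the log stream might have to do if a stray non-JSON line slipped in. The function name and the `level`/`message` fields in the sample are assumptions for illustration, not the actual audit-scanner log schema.

```python
import json
from typing import Iterable, List, Tuple


def split_log_lines(lines: Iterable[str]) -> Tuple[List[dict], List[str]]:
    """Separate lines that parse as JSON objects from lines that do not."""
    parsed: List[dict] = []
    rejected: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        try:
            value = json.loads(line)
        except json.JSONDecodeError:
            rejected.append(line)
            continue
        # A bare number or string is valid JSON but not a structured log entry.
        if isinstance(value, dict):
            parsed.append(value)
        else:
            rejected.append(line)
    return parsed, rejected


# Hypothetical log stream: two structured entries and one stray message.
sample = [
    '{"level": "info", "message": "scan started"}',
    "panic: printed outside the structured logger",
    '{"level": "info", "message": "scan finished"}',
]
parsed, rejected = split_log_lines(sample)
```

If every line is guaranteed to go through the structured logger, the `rejected` list is always empty and the consumer can drop this fallback path.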

Another consideration is that the log may disappear if the Pod is killed. This isn't the case for the default CronJob, which we configure to keep a number of failed and successful runs. But for some users this could be a problem.

Other options we have considered are providing an HTTP endpoint that serves the results, and saving the results to an in-cluster database or messaging queue instead of etcd. The idea of writing to a file on a volumeMount is new to me, and is another possible option. Personally, it feels a bit overkill to add a volumeMount just for that, though. I wonder what @kubewarden/kubewarden-developers think.

WRT your evaluation of Kubewarden:
do you already have other tools in mind to consume this JSON-formatted info? Do you have a preferred way of consuming the JSON? (Maybe you are already using a messaging queue?)
We would be happy to serve the JSON in a useful way.

@viccuad
Member

viccuad commented Aug 31, 2023

The more I think about writing to a user-provided volume mount, the more I think it's the simplest way for now. The user provides the volume mount, and in audit-scanner we can ensure we only write the PolicyReport JSON once a run has succeeded, so the volume always contains the last known state of the audit. One needs to make writes to the file concurrency-safe in case several audit-scanner jobs, started at different times, overlap. Timestamping the beginning of the file contents may be enough.
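As a rough illustration of that idea, the Python sketch below writes a report to a file only after a run has finished. It uses a temporary file plus an atomic rename so readers never see a half-written report, and a start timestamp so an older run does not overwrite a newer one. The `startedAt` field, the file layout, and the function names are assumptions made for this example. The timestamp check is best-effort: without a lock, two jobs finishing at the same instant could still race between the check and the rename.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timestamp to record when a scan starts."""
    return datetime.now(timezone.utc)


def write_report(path: str, report: dict, started_at: datetime) -> bool:
    """Write the report unless a run that started later already wrote one.

    Returns True if the file was written, False if it was skipped.
    """
    try:
        with open(path, "r", encoding="utf-8") as fh:
            existing = json.load(fh)
        if datetime.fromisoformat(existing["startedAt"]) > started_at:
            return False  # a newer run already published its result
    except (FileNotFoundError, KeyError, TypeError, ValueError):
        pass  # no previous file, or it is unreadable: overwrite it

    payload = {"startedAt": started_at.isoformat(), "report": report}
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return True
```

A job would call `utc_now()` when the scan begins and `write_report(...)` only once the scan has completed successfully.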

@flavio
Member

flavio commented Aug 31, 2023

Semi-OT: @nnelas we are glad you're looking into Kubewarden. Please don't hesitate to reach out to us. You can find us on Slack, and we also have our monthly meeting on Zoom coming up. You can find all the links on our homepage.

As Victor said, we share your concerns about the performance impact of storing the policy reports into etcd. We want to offer alternatives to this approach in the next iteration of the audit-scanner. Your feedback would allow us to determine the direction to pursue.

Do you already have plans for how to leverage the JSON output? Do you plan to rely on some centralized logging solution to handle that?

@nnelas
Contributor Author

nnelas commented Sep 4, 2023

Hallo @viccuad and @flavio ! 👋

Thank you so much for your comments, they are really insightful (and I'm glad that this is something that you already thought about)!

I've been discussing this internally with one of my co-workers (tagging @brunorene for visibility), and we think that there's a way to suit our needs without much of a hassle. To summarise, we would like to use audit-scanner to provide condensed cluster status with the help of metrics for Grafana dashboards and more detailed information available through logs with ElasticSearch. As such, we think that we can achieve that with 2 things: metrics and audit logs.

Given that the remaining Kubewarden projects are integrated with OpenTelemetry, perhaps we could leverage it to expose metrics at the namespace level, like policy-server already does with kubewarden_policy_evaluations_total.

Regarding logs, we were wondering if it would be possible to log each namespaces[].results[], as this would include relevant information for us (like the affected namespace, the non-compliant resource and the policy that it violated), which we would then use to index into ElasticSearch, which our tenants could use to query more information about their namespace.
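A minimal Python sketch of what such per-result log records could look like, one line per affected resource. The input field names follow the `--print` sample posted later in this thread; the output record keys and the function names are assumptions chosen for illustration, not an existing audit-scanner feature.

```python
import json
from typing import Iterator


def iter_result_records(report: dict) -> Iterator[dict]:
    """Yield one flat record per (result, resource) pair in namespaces[].results[]."""
    for ns_report in report.get("namespaces", []):
        namespace = ns_report.get("scope", {}).get("name")
        for result in ns_report.get("results", []):
            for resource in result.get("resources", []):
                yield {
                    "namespace": namespace,
                    "policy": result.get("policy"),
                    "result": result.get("result"),
                    "message": result.get("message", ""),
                    "resource_kind": resource.get("kind"),
                    "resource_name": resource.get("name"),
                }


def to_ndjson(report: dict) -> str:
    """Render the records as newline-delimited JSON, one log line each."""
    return "\n".join(
        json.dumps(record, sort_keys=True) for record in iter_result_records(report)
    )


# Trimmed-down report with the same shape as the --print output.
sample_report = {
    "namespaces": [
        {
            "scope": {"name": "default"},
            "results": [
                {
                    "policy": "cap-do-not-run-as-root",
                    "result": "pass",
                    "resources": [{"kind": "Pod", "name": "nginx-privileged"}],
                },
                {
                    "policy": "cap-no-privileged-pod",
                    "result": "fail",
                    "message": "Privileged container is not allowed",
                    "resources": [{"kind": "Pod", "name": "nginx-privileged"}],
                },
            ],
        }
    ]
}
records = list(iter_result_records(sample_report))
```

Each line produced by `to_ndjson` is a standalone JSON object, which is the shape most log shippers expect when indexing into a search backend.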

Also, keeping in mind that these new developments could take some time to be implemented, would you be happy for us to contribute with a PR to disable (Cluster)PolicyReport per configuration?

Hope this gave you more context about our goals with audit-scanner. Please let us know if there's anything else we can help you with, both I and the remaining engineers from my team are happy to contribute and keen to see Kubewarden thrive!

Thank you! 🙌

@viccuad
Member

viccuad commented Sep 5, 2023

> Given that the remaining Kubewarden projects are integrated with OpenTelemetry, perhaps we could leverage it to expose metrics at the namespace level, like policy-server already does with kubewarden_policy_evaluations_total.

Yes, indeed. The audit-scanner exercises the policy-server as usual; the only difference is that it hits the audit/ endpoint instead of the validate/ endpoint. The endpoints differ only in that, for audit/, the policy-server doesn't reject anything in the cluster (the requests are fake, created by audit-scanner, so there is nothing to reject).

The policy-server is already instrumented when hitting the audit/ endpoint, see here, so you should be able to consume those.

@jvanz , just realized that the PR for the audit-scanner docs doesn't mention the new instrumentation, could you add a small paragraph there? (as the PR branch is on your personal repo)

> Regarding logs, we were wondering if it would be possible to log each namespaces[].results[], as this would include relevant information for us (like the affected namespace, the non-compliant resource and the policy that it violated), which we would then use to index into ElasticSearch, which our tenants could use to query more information about their namespace.

Yes, that's definitely possible. Currently, with audit-scanner --print, we get a JSON document. It has 1 ClusterWidePolicyReport, and 1 PolicyReport per namespace. Pretty-printed, it looks like:

{
"cluster": {
  "metadata": {
    "name": "polr-clusterwide",
    "uid": "abc12691-8395-459c-ad5b-9411bae2cb86",
    "resourceVersion": "65555",
    "generation": 3,
    "creationTimestamp": "2023-09-01T08:39:11Z",
    "labels": {
      "app.kubernetes.io/managed-by": "kubewarden"
    },
    "managedFields": [
      {
        "manager": "audit-scanner",
        "operation": "Update",
        "apiVersion": "wgpolicyk8s.io/v1alpha2",
        "time": "2023-09-05T17:06:01Z",
        "fieldsType": "FieldsV1",
        "fieldsV1": {
          "f:metadata": {
            "f:labels": {
              ".": {},
              "f:app.kubernetes.io/managed-by": {}
            }
          },
          "f:results": {},
          "f:summary": {
            ".": {},
            "f:error": {},
            "f:fail": {},
            "f:pass": {},
            "f:skip": {},
            "f:warn": {}
          }
        }
      }
    ]
  },
  "summary": {
    "pass": 6,
    "fail": 1,
    "warn": 0,
    "error": 0,
    "skip": 0
  },
  "results": [
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "pass",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "cert-manager",
          "uid": "6fa05d34-da6a-41be-9fd4-eb0cbb013842",
          "apiVersion": "v1",
          "resourceVersion": "290"
        }
      ],
      "resourceSelector": {},
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    },
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "pass",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "default",
          "uid": "354f0f27-d4bd-49ff-8d4b-fd0544a40079",
          "apiVersion": "v1",
          "resourceVersion": "195"
        }
      ],
      "resourceSelector": {},
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    },
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "fail",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "demo2",
          "uid": "c38342fe-91f8-4f1e-b1ee-4fd783ec6ee4",
          "apiVersion": "v1",
          "resourceVersion": "65308"
        }
      ],
      "resourceSelector": {},
      "message": "The following labels are denied: cost-center",
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    },
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "pass",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "kube-node-lease",
          "uid": "5d4c6ea8-6bef-4421-992b-72dd4d3080bc",
          "apiVersion": "v1",
          "resourceVersion": "43"
        }
      ],
      "resourceSelector": {},
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    },
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "pass",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "kube-public",
          "uid": "03c68d14-43b6-4fdd-a993-1fdfd37d8226",
          "apiVersion": "v1",
          "resourceVersion": "38"
        }
      ],
      "resourceSelector": {},
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    },
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "pass",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "kube-system",
          "uid": "a6834419-41d4-4070-b674-6a8cd41397e1",
          "apiVersion": "v1",
          "resourceVersion": "18"
        }
      ],
      "resourceSelector": {},
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    },
    {
      "source": "kubewarden",
      "policy": "cap-safe-labels",
      "rule": "safe-labels",
      "category": "Resource validation",
      "severity": "low",
      "timestamp": {
        "seconds": 1693933561,
        "nanos": 0
      },
      "result": "pass",
      "scored": true,
      "resources": [
        {
          "kind": "Namespace",
          "name": "kubewarden",
          "uid": "7ad50fa3-9337-43ac-ab37-bdab174219b9",
          "apiVersion": "v1",
          "resourceVersion": "596"
        }
      ],
      "resourceSelector": {},
      "properties": {
        "policy-resource-version": "65520",
        "policy-uid": "4d68903d-f0a7-4689-a2cf-ec63f277f986",
        "validating": "true"
      }
    }
  ]
},
"namespaces": [
  {
    "kind": "PolicyReport",
    "apiVersion": "wgpolicyk8s.io/v1alpha2",
    "metadata": {
      "name": "polr-ns-default",
      "namespace": "default",
      "uid": "233ba029-a882-41f0-b03f-5d7f5c131521",
      "resourceVersion": "65556",
      "generation": 2,
      "creationTimestamp": "2023-09-05T17:05:32Z",
      "labels": {
        "app.kubernetes.io/managed-by": "kubewarden"
      },
      "managedFields": [
        {
          "manager": "audit-scanner",
          "operation": "Update",
          "apiVersion": "wgpolicyk8s.io/v1alpha2",
          "time": "2023-09-05T17:06:01Z",
          "fieldsType": "FieldsV1",
          "fieldsV1": {
            "f:metadata": {
              "f:labels": {
                ".": {},
                "f:app.kubernetes.io/managed-by": {}
              }
            },
            "f:results": {},
            "f:scope": {
              ".": {},
              "f:name": {},
              "f:resourceVersion": {},
              "f:uid": {}
            },
            "f:summary": {
              ".": {},
              "f:error": {},
              "f:fail": {},
              "f:pass": {},
              "f:skip": {},
              "f:warn": {}
            }
          }
        }
      ]
    },
    "scope": {
      "name": "default",
      "uid": "354f0f27-d4bd-49ff-8d4b-fd0544a40079",
      "resourceVersion": "195"
    },
    "summary": {
      "pass": 5,
      "fail": 1,
      "warn": 0,
      "error": 0,
      "skip": 0
    },
    "results": [
      {
        "source": "kubewarden",
        "policy": "cap-do-not-run-as-root",
        "rule": "do-not-run-as-root",
        "category": "PSP",
        "severity": "info",
        "timestamp": {
          "seconds": 1693933561,
          "nanos": 0
        },
        "result": "pass",
        "scored": true,
        "resources": [
          {
            "kind": "Pod",
            "namespace": "default",
            "name": "nginx-privileged",
            "uid": "4240a567-76cd-44b3-85de-41494b69556a",
            "apiVersion": "v1",
            "resourceVersion": "65408"
          }
        ],
        "resourceSelector": {},
        "properties": {
          "mutating": "true",
          "policy-resource-version": "65533",
          "policy-uid": "0fa3a93d-30b3-4b8e-9332-720956c0d6c1"
        }
      },
      {
        "source": "kubewarden",
        "policy": "cap-do-not-share-host-paths",
        "rule": "do-not-share-host-paths",
        "category": "PSP",
        "severity": "info",
        "timestamp": {
          "seconds": 1693933561,
          "nanos": 0
        },
        "result": "pass",
        "scored": true,
        "resources": [
          {
            "kind": "Pod",
            "namespace": "default",
            "name": "nginx-privileged",
            "uid": "4240a567-76cd-44b3-85de-41494b69556a",
            "apiVersion": "v1",
            "resourceVersion": "65408"
          }
        ],
        "resourceSelector": {},
        "properties": {
          "policy-resource-version": "65546",
          "policy-uid": "8dff085b-1d92-4f2e-bbd6-7e0f11eae348",
          "validating": "true"
        }
      },
      {
        "source": "kubewarden",
        "policy": "cap-drop-capabilities",
        "rule": "drop-capabilities",
        "category": "PSP",
        "severity": "info",
        "timestamp": {
          "seconds": 1693933561,
          "nanos": 0
        },
        "result": "pass",
        "scored": true,
        "resources": [
          {
            "kind": "Pod",
            "namespace": "default",
            "name": "nginx-privileged",
            "uid": "4240a567-76cd-44b3-85de-41494b69556a",
            "apiVersion": "v1",
            "resourceVersion": "65408"
          }
        ],
        "resourceSelector": {},
        "properties": {
          "mutating": "true",
          "policy-resource-version": "65514",
          "policy-uid": "079179b9-f1e4-451d-a9f7-7e255b5c00c0"
        }
      },
      {
        "source": "kubewarden",
        "policy": "cap-no-host-namespace-sharing",
        "rule": "no-host-namespace-sharing",
        "category": "PSP",
        "severity": "info",
        "timestamp": {
          "seconds": 1693933561,
          "nanos": 0
        },
        "result": "pass",
        "scored": true,
        "resources": [
          {
            "kind": "Pod",
            "namespace": "default",
            "name": "nginx-privileged",
            "uid": "4240a567-76cd-44b3-85de-41494b69556a",
            "apiVersion": "v1",
            "resourceVersion": "65408"
          }
        ],
        "resourceSelector": {},
        "properties": {
          "policy-resource-version": "65523",
          "policy-uid": "154cd0e9-6d59-48c2-97f5-580a6a8634ed",
          "validating": "true"
        }
      },
      {
        "source": "kubewarden",
        "policy": "cap-no-privilege-escalation",
        "rule": "no-privilege-escalation",
        "category": "PSP",
        "severity": "info",
        "timestamp": {
          "seconds": 1693933561,
          "nanos": 0
        },
        "result": "pass",
        "scored": true,
        "resources": [
          {
            "kind": "Pod",
            "namespace": "default",
            "name": "nginx-privileged",
            "uid": "4240a567-76cd-44b3-85de-41494b69556a",
            "apiVersion": "v1",
            "resourceVersion": "65408"
          }
        ],
        "resourceSelector": {},
        "properties": {
          "mutating": "true",
          "policy-resource-version": "65526",
          "policy-uid": "5ca390b5-7d51-4f54-86ea-7ca921b51851"
        }
      },
      {
        "source": "kubewarden",
        "policy": "cap-no-privileged-pod",
        "rule": "no-privileged-pod",
        "category": "PSP",
        "severity": "info",
        "timestamp": {
          "seconds": 1693933561,
          "nanos": 0
        },
        "result": "fail",
        "scored": true,
        "resources": [
          {
            "kind": "Pod",
            "namespace": "default",
            "name": "nginx-privileged",
            "uid": "4240a567-76cd-44b3-85de-41494b69556a",
            "apiVersion": "v1",
            "resourceVersion": "65408"
          }
        ],
        "resourceSelector": {},
        "message": "Privileged container is not allowed",
        "properties": {
          "policy-resource-version": "65530",
          "policy-uid": "b4f7b09d-c7a5-4424-9929-c34a3446220f",
          "validating": "true"
        }
      }
    ]
  }
]
}

It has $.cluster.results and $.namespaces[:1].results, and each namespace has a $.namespaces[:1].scope holding the name of the namespace.
The specification of the JSON is for now report/report.go.
If that's enough, then with #103 and your PR to the helm-charts, you should be able to consume it.
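As a hedged example of consuming that structure, the Python sketch below tallies results per namespace and outcome, which is roughly the condensed view a dashboard would need. It relies only on the fields visible in the sample above (`cluster.results`, `namespaces[].scope.name`, `namespaces[].results[].result`); the function name and the `<cluster>` and `<unknown>` labels are made up for this example.

```python
from collections import Counter


def tally_results(report: dict) -> Counter:
    """Count results keyed by (namespace, outcome).

    Cluster-wide results are counted under the placeholder name "<cluster>".
    """
    counts: Counter = Counter()
    for result in report.get("cluster", {}).get("results", []):
        counts[("<cluster>", result.get("result"))] += 1
    for ns_report in report.get("namespaces", []):
        name = ns_report.get("scope", {}).get("name", "<unknown>")
        for result in ns_report.get("results", []):
            counts[(name, result.get("result"))] += 1
    return counts


# Trimmed-down report with the same shape as the sample above.
sample_report = {
    "cluster": {"results": [{"result": "pass"}, {"result": "fail"}]},
    "namespaces": [
        {
            "scope": {"name": "default"},
            "results": [{"result": "pass"}, {"result": "pass"}, {"result": "fail"}],
        }
    ],
}
counts = tally_results(sample_report)
```

The same numbers are already available in each report's `summary` object; walking `results` instead makes it possible to break them down further, for example by policy.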

> Also, keeping in mind that these new developments could take some time to be implemented, would you be happy for us to contribute with a PR to disable (Cluster)PolicyReport per configuration?

Yes, definitely :). We already had in mind adding other stores in the future, such as an in-memory one, an SQLite one, etc.

Opened #107 for it. It involves adding a store interface, and a new in-mem store.
Feel free to ask any questions there, and if you folks want to tackle it, or parts of it, just comment on it and we can assign it to you.
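To give a feel for the shape of that change, here is a small Python illustration of a store abstraction with an in-memory implementation. audit-scanner itself is written in Go and the actual interface is whatever #107 ends up defining, so every name and method below (`ReportStore`, `save`, `get`, `InMemoryReportStore`, `run_scan`) is hypothetical.

```python
from typing import Dict, Optional, Protocol


class ReportStore(Protocol):
    """Anything that can persist and return per-namespace reports."""

    def save(self, namespace: str, report: dict) -> None: ...

    def get(self, namespace: str) -> Optional[dict]: ...


class InMemoryReportStore:
    """Keeps reports in a dict, so nothing is written to etcd."""

    def __init__(self) -> None:
        self._reports: Dict[str, dict] = {}

    def save(self, namespace: str, report: dict) -> None:
        self._reports[namespace] = report

    def get(self, namespace: str) -> Optional[dict]:
        return self._reports.get(namespace)


def run_scan(store: ReportStore, scan_results: Dict[str, dict]) -> None:
    """Stand-in for the scanner: it depends only on the store interface."""
    for namespace, report in scan_results.items():
        store.save(namespace, report)


store = InMemoryReportStore()
run_scan(store, {"default": {"summary": {"pass": 5, "fail": 1}}})
```

Because the scanner only talks to the interface, swapping the (Cluster)PolicyReport-backed store for an in-memory or database-backed one would be a configuration choice rather than a code change.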

> Hope this gave you more context about our goals with audit-scanner. Please let us know if there's anything else we can help you with, both I and the remaining engineers from my team are happy to contribute and keen to see Kubewarden thrive!

Thanks for the kind words! ❤️

@jvanz
Member

jvanz commented Sep 6, 2023

> @jvanz , just realized that the PR for the audit-scanner docs doesn't mention the new instrumentation, could you add a small paragraph there? (as the PR branch is on your personal repo)

Sure!

@jvanz
Member

jvanz commented Sep 6, 2023

> @jvanz , just realized that the PR for the audit-scanner docs doesn't mention the new instrumentation, could you add a small paragraph there? (as the PR branch is on your personal repo)

Done. kubewarden/docs#199 (comment)
