
Security Monitoring Stack for Kubernetes

The easiest way to install a monitoring stack and monitor the security of your Kubernetes cluster.

This stack includes:

  • Loki
  • Promtail
  • Grafana
  • Victoria Metrics Stack
  • Alertmanager
  • Kube-Bench Exporter
  • Falco Exporter
  • Trivy-Operator

Alerts From Alertmanager

  • Some of the included alerts:
    • Loki Alerts on Errors in Logs
    • Some default alerts
    • Kubernetes Node Not Ready
    • Kubernetes Memory Pressure
    • Kubernetes Disk Pressure
    • Kubernetes Network Unavailable
    • Kubernetes Out Of Capacity
    • Kubernetes Container Oom Killer
    • Kubernetes Job Failed
    • Kubernetes Cronjob Suspended
    • Kubernetes Persistentvolumeclaim Pending
    • Kubernetes Volume Out Of Disk Space
    • Kubernetes Volume Full In Four Days
    • Kubernetes Persistentvolume Error
    • Kubernetes Statefulset Down
    • Kubernetes Hpa Scaling Ability
    • Kubernetes Hpa Metric Availability
    • Kubernetes Hpa Scale Capability
    • Kubernetes Hpa Underutilized
    • Kubernetes Pod Not Healthy
    • Kubernetes Pod CrashLooping
    • Kubernetes ReplicaSet Mismatch
    • Kubernetes Deployment Replicas Mismatch
    • Kubernetes Statefulset Replicas Mismatch
    • Kubernetes Deployment Generation Mismatch
    • Kubernetes Statefulset Generation Mismatch
    • Kubernetes Statefulset Update Not RolledOut
    • Kubernetes Daemonset Rollout Stuck
    • Kubernetes Daemonset Misscheduled
    • Kubernetes Cronjob Too Long
    • Kubernetes Job Slow Completion
    • Kubernetes Api Server Errors
    • Kubernetes Api Client Errors
    • Kubernetes Client Certificate Expires Next Week
    • Kubernetes Client Certificate Expires Soon
    • Kubernetes Api Server Latency
    • Loki 5xx errors (see the Error 5** rule below)
    • Severity level - Error
    • Ledger Error

🌸 Setup

This is a step-by-step guide to installing the stack and its dependencies.

Requirements:

2 CPUs, 4 GB RAM

Kubernetes version used for testing: v1.28.2

Prerequisites

Clone repo

git clone https://github.com/chabanyknikita/security-monitoring-template.git
cd security-monitoring-template

Add the Helm repositories for this stack:

helm repo add jetstack https://charts.jetstack.io
helm repo add stable https://charts.helm.sh/stable
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm repo update
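
You can confirm the repositories were added and that the charts used later in this guide are reachable:

# List configured Helm repositories
helm repo list
# Check that the charts installed below are available
helm search repo falcosecurity/falco
helm search repo aqua/trivy-operator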

First, you can disable any components you don't need in charts/monitoring/charts/SVC/values.yaml, for example:

grafana:
  enabled: false

alertmanager:
  enabled: false

vmalert:
  enabled: false

Install ingress-nginx and cert-manager if they are not already installed:

helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.8.0 \
--set installCRDs=true
helm upgrade --install ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx --create-namespace
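
Before continuing, you can verify that both controllers are running (namespaces match the commands above; the label selector is the ingress-nginx chart's default):

kubectl get pods -n cert-manager
kubectl get pods -n ingress-nginx
# Wait until the ingress controller reports Ready
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=120s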

Configure Alertmanager:

Go to charts/monitoring/values.yaml and, in the victoria-metrics-k8s-stack.alertmanager.config section, replace these values with your own:

chat_id: <Chat Id>       # must be an integer
bot_token: <Bot Token>   # must be a string

  • You can get these values from:

KEY             VALUE
TELEGRAM_ADMIN  your chat ID (from @userinfobot)
TELEGRAM_TOKEN  your Telegram bot token (from @botfather)

Note that only one user (the configured chat_id) will receive the bot's alerts.
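
For orientation, here is a minimal sketch of how a Telegram receiver looks in a standard Alertmanager configuration; the receiver and route names are illustrative, so in practice you only substitute the chat_id and bot_token fields that already exist in this repository's values.yaml:

config:
  route:
    receiver: telegram
  receivers:
    - name: telegram
      telegram_configs:
        - api_url: https://api.telegram.org
          bot_token: "<Bot Token>"   # string, from @botfather
          chat_id: <Chat Id>         # integer, from @userinfobot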

Configure Grafana Ingress (optional):

If you want to access Grafana via a domain:

Go to charts/monitoring/values.yaml, section victoria-metrics-k8s-stack.grafana.ingress, and:

  • Change grafana.ingress.enabled to true
  • Change grafana.ingress.hosts to your domain
  • Change grafana.ingress.tls.hosts to your domain

Example:

  ingress:
    enabled: true
    annotations:
      certmanager.k8s.io/cluster-issuer: letsencrypt
      cert-manager.io/cluster-issuer: letsencrypt
      kubernetes.io/ingress.class: nginx
      kubernetes.io/tls-acme: "true"
    pathType: ImplementationSpecific
    hosts:
      - grafana.example.com
    tls:
      - secretName: grafana-ingress-tls
        hosts:
          - grafana.example.com
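
Note that this example assumes a ClusterIssuer named letsencrypt already exists in your cluster. After the stack is installed into the monitoring namespace (see Installation below), you can check that the ingress and its TLS certificate were created:

kubectl get ingress -n monitoring
# cert-manager's ingress-shim should create a Certificate for the grafana-ingress-tls secret
kubectl get certificate -n monitoring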

Change the Namespace for Loki Alerts

  • Go to charts/monitoring/values.yaml, section loki-distributed.ruler.directories, and in every rule change the namespace to the one(s) you want to watch.

Example:

              - alert: Error 5**
                expr: rate({namespace="stage", container!="horizon"} |~ "status=5.." | logfmt | label_format duration=duration,time=time,filename=filename,pid=pid,stream=stream,node_name=node_name,app=app,instance=instance[1m])>0
                for: 0m
                labels:
                  severity: error
                annotations:
                  summary: Error {{ $labels.status }} in {{ $labels.container }}

Or you can watch more than one namespace:

              - alert: Error 5**
                expr: rate({namespace=~"monitoring|stage|prod"} |~ "status=5.." | logfmt | label_format duration=duration,time=time,filename=filename,pid=pid,stream=stream,node_name=node_name,app=app,instance=instance[1m])>0
                for: 0m
                labels:
                  severity: error
                annotations:
                  summary: Error {{ $labels.status }} in {{ $labels.container }}
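
Before committing a rule change, it can help to run the query by hand in Grafana Explore against the Loki data source; a simplified form of the same expression (without the label_format stage) is enough to confirm that matching log lines exist:

rate({namespace=~"monitoring|stage|prod"} |~ "status=5.." [1m])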

Upgrade ingress-nginx so that it exposes metrics for scraping:

helm upgrade ingress-nginx ingress-nginx \
--repo https://kubernetes.github.io/ingress-nginx \
--namespace ingress-nginx \
--set controller.metrics.enabled=true \
--set-string controller.podAnnotations."prometheus\.io/scrape"="true" \
--set-string controller.podAnnotations."prometheus\.io/port"="10254"
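
To confirm the metrics endpoint is reachable, you can port-forward the controller (the deployment name below is the chart's default) and query it:

kubectl port-forward -n ingress-nginx deployment/ingress-nginx-controller 10254:10254
# In another terminal:
curl -s http://localhost:10254/metrics | grep nginx_ingress_controller | head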

Installation

Install the NFS server provisioner:

helm upgrade -i nfs-server stable/nfs-server-provisioner --set persistence.enabled=true,persistence.size=20Gi -n monitoring --create-namespace
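
The provisioner registers a StorageClass (typically named nfs for this chart) that PersistentVolumeClaims in the stack can use; you can verify it came up (the label below is the chart's default and may vary):

kubectl get storageclass
kubectl get pods -n monitoring -l app=nfs-server-provisioner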

Install the CRDs:

kubectl apply -f charts/monitoring/charts/crd/templates/crd.yaml
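
To confirm the custom resource definitions were registered (the exact names depend on what crd.yaml contains), list them:

kubectl get crd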

Install the Trivy operator:

helm upgrade -i trivy-operator aqua/trivy-operator --namespace trivy-system --create-namespace --version 0.20.6 --values charts/trivy-operator/trivy-values.yaml
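
Once the operator is running it scans workloads and writes the results as custom resources; a quick way to confirm it is working (resource names are the operator's defaults):

kubectl get pods -n trivy-system
# Vulnerability reports appear per workload after the first scan cycle
kubectl get vulnerabilityreports --all-namespaces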

Install Falco and falco-exporter:

helm upgrade -i falco --set falco.grpc.enabled=true --set falco.grpc_output.enabled=true --set driver.kind=ebpf falcosecurity/falco
helm upgrade -i falco-exporter falcosecurity/falco-exporter
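
These commands install Falco and the exporter into the current (default) namespace; you can check that they are running and that Falco has loaded its rules (the label selectors and container name are the charts' defaults and may vary by chart version):

kubectl get pods -l app.kubernetes.io/name=falco
kubectl get pods -l app.kubernetes.io/name=falco-exporter
# Falco logs show the loaded rules and any triggered events
kubectl logs -l app.kubernetes.io/name=falco -c falco --tail=50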

Optionally, install event-generator to see how Falco's event rules fire:

helm install event-generator falcosecurity/event-generator --namespace event-generator --create-namespace --set config.loop=false --set config.actions=""
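
With config.loop=false the generator runs its actions once; after it finishes, the synthetic events it triggers should show up as Notice/Warning lines in the Falco logs checked above (label and container name are again the chart defaults):

kubectl get pods -n event-generator
kubectl logs -l app.kubernetes.io/name=falco -c falco --tail=100 | grep -E "Notice|Warning"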

Install the monitoring stack:

helm upgrade -i monitoring charts/monitoring --values charts/monitoring/values.yaml -n monitoring
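
The release deploys into the monitoring namespace; you can watch the components come up and check the release status:

kubectl get pods -n monitoring
helm status monitoring -n monitoring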

Get the Grafana admin password:

kubectl get secret --namespace monitoring stack-grafana \
-ojsonpath="{.data.admin-password}" | base64 --decode ; echo

Access the Grafana UI

  • Credentials:

    • login: admin
    • password: from the previous step
  • If you enabled the ingress, go to your domain and enter the credentials.

  • If you did not enable the ingress, port-forward and open http://localhost:3000:

kubectl port-forward service/stack-grafana -n monitoring 3000:80

You have now installed the security monitoring stack on your Kubernetes cluster!
