Koperator crashes when a broker has no storage configurations set in KafkaCluster #1033

Open
2 tasks done
panyuenlau opened this issue Aug 10, 2023 · 1 comment
Labels
bug (Something isn't working), community, help wanted (Extra attention is needed), triaged (root-cause of the bug is known)

Comments

@panyuenlau
Member

Description

Koperator crashes when a broker under the KafkaCluster CR has no storage configuration set.

Expected Behavior

Koperator should handle a KafkaCluster in which one or more brokers have no storage configuration set.

Actual Behavior

Koperator crashes because of a nil pointer dereference:

{"level":"info","ts":"2023-07-21T14:28:21.735Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"KafkaCluster","controllerGroup":"kafka.banzaicloud.io","controllerKind":"KafkaCluster","KafkaCluster":{"name":"test","namespace":"default"},"namespace":"default","name":"test","reconcileID":"c2ced75e-acaf-404d-aa53-ecab89a4f1d5"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x1804819]

goroutine 504 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119 +0x1fa
panic({0x1d09840, 0x3464fd0})
	/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/banzaicloud/koperator/pkg/resources/kafka.(*Reconciler).Reconcile(0xc00056b340, {{0x2375930?, 0xc0013f3cb0?}, 0x1c54020?})
	/workspace/pkg/resources/kafka/kafka.go:251 +0x2419
github.com/banzaicloud/koperator/controllers.(*KafkaClusterReconciler).Reconcile(0xc0003f20a0, {0x2370650, 0xc0013f3d40}, {{{0xc0007ecb20?, 0x10?}, {0xc0007ecb1c?, 0x40da67?}}})
	/workspace/controllers/kafkacluster_controller.go:126 +0x8e3
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x2370650?, {0x2370650?, 0xc0013f3d40?}, {{{0xc0007ecb20?, 0x1c58e60?}, {0xc0007ecb1c?, 0x0?}}})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0004521e0, {0x23705a8, 0xc0007341c0}, {0x1daf980?, 0xc000e26360?})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323 +0x3a5
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0004521e0, {0x23705a8, 0xc0007341c0})
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:231 +0x333

Affected Version

<= v0.25.1

Steps to Reproduce

  1. Create a KafkaCluster that intentionally provides no storage configuration for any of its brokers:
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  name: test
spec:
  ...
  brokers:
    - id: 0
    - id: 1
  ...
  2. Observe Koperator behavior.

Checklist

@panyuenlau added the bug label on Aug 10, 2023
@panyuenlau
Member Author

Root cause

Koperator expects each broker's storage configuration to be set through either brokers[x].storageConfigs or brokers[x].brokerConfigGroup. It wrongly assumes that users always set one of the two, so when neither is set it hits a nil pointer dereference.

Potential Solutions

  1. When neither configuration is provided, Koperator gives the broker a default storage configuration (with a PVC), e.g.:
  - mountPath: "/kafka-logs"
    pvcSpec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi

Note: this might require us to start implementing a mutating webhook in Koperator.

  2. Handle all the potential nil pointer accesses across the current implementation, and start the broker with no storage configuration. By default, Kafka uses /tmp/kafka-logs as the log directory, and K8s uses local ephemeral storage for the pod.

Note: ephemeral storage is tied to the lifecycle of the pod. When the pod finishes or is restarted, the storage is cleared out.
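The default-storage option could look roughly like the Go sketch below. It assumes simplified, hypothetical types (PVCSpec, StorageConfig) and a hypothetical applyDefaultStorage helper rather than Koperator's real types or webhook wiring; the default values are the ones proposed in this comment.

```go
package main

import "fmt"

// PVCSpec and StorageConfig are simplified, hypothetical stand-ins
// for the storage section of a broker config.
type PVCSpec struct {
	AccessModes    []string
	StorageRequest string
}

type StorageConfig struct {
	MountPath string
	PVCSpec   *PVCSpec
}

// defaultStorageConfig returns the default proposed in this issue:
// a 10Gi ReadWriteOnce PVC mounted at /kafka-logs.
func defaultStorageConfig() StorageConfig {
	return StorageConfig{
		MountPath: "/kafka-logs",
		PVCSpec: &PVCSpec{
			AccessModes:    []string{"ReadWriteOnce"},
			StorageRequest: "10Gi",
		},
	}
}

// applyDefaultStorage (hypothetical helper) leaves user-provided
// configs untouched and falls back to the default only when the
// broker ends up with no storage configuration at all.
func applyDefaultStorage(configs []StorageConfig) []StorageConfig {
	if len(configs) > 0 {
		return configs
	}
	return []StorageConfig{defaultStorageConfig()}
}

func main() {
	// A broker with no storage configuration receives the default.
	for _, sc := range applyDefaultStorage(nil) {
		fmt.Printf("mountPath=%s storage=%s\n", sc.MountPath, sc.PVCSpec.StorageRequest)
	}
}
```

Whether such defaulting runs in a mutating webhook or inside the reconciler is an open design choice; the webhook makes the default visible in the stored CR, while in-reconciler defaulting keeps the CR unchanged.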

@panyuenlau added the help wanted, community, and triaged labels on Aug 10, 2023