
KRaft Support

Darren Lau edited this page Aug 8, 2023 · 4 revisions

KRaft Support Discussion

Context

We aim to introduce KRaft support to Koperator to accommodate the significant changes that KRaft brings to Kafka. Notable differences emerge between Kafka in the KRaft world and Kafka in the ZooKeeper world:

| What | Why | Impact on Koperator |
| --- | --- | --- |
| A Kafka node can take one of three roles: broker, controller, or both (broker+controller; Kafka does not recommend the combined role for production). | The controller processes replace the ZooKeeper nodes in managing the cluster metadata. | "broker" is no longer generic enough to represent every node in a Kafka cluster, so the KafkaCluster API needs to be updated to reflect this. |
| The DescribeClusterRequest API no longer exposes the active controller (or, in fact, any of the controller nodes); see the Kafka source code for reference. | Kafka tries to isolate controller access from the admin client in the KRaft world. Old admin clients that send requests directly to the controller are given a random broker ID and rely on that broker to forward the original requests. | determineControllerId is essentially deprecated in the KRaft world, so the reorderBrokers logic can no longer take the controller ID into consideration. In fact, re-electing the active controller in the KRaft world is not as expensive as it was in the ZooKeeper world, because the non-active controllers try to keep up-to-date metadata in memory and on disk (these are called "hot standby" controllers). |
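
As a rough illustration of the three roles listed above, the sketch below maps each role to the value Kafka expects in its `process.roles` property in KRaft mode. The `NodeRole` type, its constants, and the `processRoles` helper are hypothetical names used only for this example; they are not part of Koperator's API.

```go
package main

import "fmt"

// NodeRole is a hypothetical type for this sketch; Koperator does not define it.
type NodeRole int

const (
	RoleBroker NodeRole = iota
	RoleController
	RoleCombined // broker+controller, not recommended for production
)

// processRoles returns the value of Kafka's `process.roles` property
// for the given role. Unknown roles yield an empty string.
func processRoles(r NodeRole) string {
	switch r {
	case RoleBroker:
		return "broker"
	case RoleController:
		return "controller"
	case RoleCombined:
		return "broker,controller"
	default:
		return ""
	}
}

func main() {
	for _, r := range []NodeRole{RoleBroker, RoleController, RoleCombined} {
		fmt.Printf("process.roles=%s\n", processRoles(r))
	}
}
```

A generator of per-node configuration could call a helper like this once per node, whatever shape the KafkaCluster API ends up taking.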

Problem Statement - 1

How should the KafkaCluster API be updated to reflect the changes in KRaft world?

Options

Option 1

Retain the existing Broker struct and introduce Controllers and CombinedNodes structs within KafkaClusterSpec:

// other fields are intentionally omitted here
type KafkaClusterSpec struct {
    // ControllerMode specifies the Kafka cluster in either ZooKeeper or KRaft mode.
    // +kubebuilder:validation:Enum=kraft;zookeeper
    // +optional
    ControllerMode              ControllerMode          `json:"controllerMode,omitempty"`
    Brokers                     []Broker                `json:"brokers"`
    Controllers                 []Controller            `json:"controllers,omitempty"`
    CombinedNodes               []CombinedNode          `json:"combinedNodes,omitempty"`
}

// Controller defines basic configurations for controllers (in KRaft)
type Controller struct {
    Id               int32             `json:"id"`
    ReadOnlyConfig   string            `json:"readOnlyConfig,omitempty"`
    ControllerConfig *ControllerConfig `json:"controllerConfig"`
}

type ControllerConfig struct {
    // Use the existing BrokerConfig as a blueprint to add/remove corresponding fields from the BrokerConfig
    // reference of BrokerConfig: https://github.com/banzaicloud/koperator/blob/master/api/v1beta1/kafkacluster_types.go#L19
}

// Note: need to find a way to merge the BrokerConfig and ControllerConfig nicely
type CombinedNode struct {
    Id                       int32                     `json:"id"`
    ReadOnlyConfig           string                    `json:"readOnlyConfig,omitempty"`
    BrokerConfig             *BrokerConfig             `json:"brokerConfig,omitempty"`
    ControllerConfig         *ControllerConfig         `json:"controllerConfig,omitempty"`
}

Option 2

Extract the configurations that apply to both brokers and controllers from the current BrokerConfig, and allow users to mark a broker node as a combined (broker + controller) node, mainly for development use:

// other fields are intentionally omitted here
type KafkaClusterSpec struct {
    // ControllerMode specifies the Kafka cluster in either ZooKeeper or KRaft mode.
    // +kubebuilder:validation:Enum=kraft;zookeeper
    // +optional
    ControllerMode              ControllerMode          `json:"controllerMode,omitempty"`
    Brokers                     []Broker                `json:"brokers"`
    Controllers                 []Controller            `json:"controllers,omitempty"`
}

type Broker struct {
    Id                int32         `json:"id"`
    BrokerConfigGroup string        `json:"brokerConfigGroup,omitempty"`
    ReadOnlyConfig    string        `json:"readOnlyConfig,omitempty"`
    BrokerConfig      *BrokerConfig `json:"brokerConfig,omitempty"`
    // CombinedNode indicates whether this broker node is a combined (broker + controller) node in KRaft mode. If set to true,
    // Koperator assumes that ReadOnlyConfig includes the read-only configurations for both the controller and broker processes.
    // Defaults to false. In ZooKeeper mode, Koperator ignores this field even if it is set to true.
    // +optional
    CombinedNode bool `json:"combinedNode,omitempty"`
}

// BrokerConfig defines the broker configurations
type BrokerConfig struct {
    CommonConfig         `json:",inline"`
    BrokerSpecificConfig `json:",inline"`
}

// BrokerSpecificConfig defines the configurations that are only applicable to brokers
type BrokerSpecificConfig struct {
    BrokerIngressMapping    []string               `json:"brokerIngressMapping,omitempty"`
    Config                  string                 `json:"config,omitempty"`
    MetricsReporterImage    string                 `json:"metricsReporterImage,omitempty"`
    NetworkConfig           *NetworkConfig         `json:"networkConfig,omitempty"`
    NodePortExternalIP      map[string]string      `json:"nodePortExternalIP,omitempty"`
    NodePortNodeAddressType corev1.NodeAddressType `json:"nodePortNodeAddressType,omitempty"`
}

// Controller represents "controller" nodes in KRaft. This is not applicable to ZooKeeper mode
type Controller struct {
    Id               int32             `json:"id"`
    ReadOnlyConfig   string            `json:"readOnlyConfig,omitempty"`
    ControllerConfig *ControllerConfig `json:"controllerConfig,omitempty"`
}

// ControllerConfig defines the controller configurations in KRaft. This section is ignored in ZooKeeper-mode.
type ControllerConfig struct {
    CommonConfig             `json:",inline"`
    ControllerSpecificConfig `json:",inline"`
}

// ControllerSpecificConfig defines the controller-specific configurations in KRaft
type ControllerSpecificConfig struct {
}

// CommonConfig holds the common configurations that are applicable to both the "brokers" and "controllers" (in KRaft term)
// In ZooKeeper-mode, this is just a subset of the old BrokerConfig
type CommonConfig struct {
    Affinity               *corev1.Affinity              `json:"affinity,omitempty"`
    Annotations            map[string]string             `json:"annotations,omitempty"`
    Containers             []corev1.Container            `json:"containers,omitempty"`
    Envs                   []corev1.EnvVar               `json:"envs,omitempty"`
    Image                  string                        `json:"image,omitempty"`
    ImagePullSecrets       []corev1.LocalObjectReference `json:"imagePullSecrets,omitempty"`
    InitContainers         []corev1.Container            `json:"initContainers,omitempty"`
    KafkaHeapOpts          string                        `json:"kafkaHeapOpts,omitempty"`
    KafkaJVMPerfOpts       string                        `json:"kafkaJvmPerfOpts,omitempty"`
    Labels                 map[string]string             `json:"labels,omitempty"`
    Log4jConfig            string                        `json:"log4jConfig,omitempty"`
    NodeSelector           map[string]string             `json:"nodeSelector,omitempty"`
    PodSecurityContext     *corev1.PodSecurityContext    `json:"podSecurityContext,omitempty"`
    PriorityClassName      string                        `json:"priorityClassName,omitempty"`
    Resources              *corev1.ResourceRequirements  `json:"resourceRequirements,omitempty"`
    ServiceAccountName     string                        `json:"serviceAccountName,omitempty"`
    SecurityContext        *corev1.SecurityContext       `json:"securityContext,omitempty"`
    StorageConfigs         []StorageConfig               `json:"storageConfigs,omitempty"`
    TerminationGracePeriod *int64                        `json:"terminationGracePeriodSeconds,omitempty"`
    Tolerations            []corev1.Toleration           `json:"tolerations,omitempty"`
    VolumeMounts           []corev1.VolumeMount          `json:"volumeMounts,omitempty"`
    Volumes                []corev1.Volume               `json:"volumes,omitempty"`
}
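
The comment on the CombinedNode field describes a small decision rule: the flag only matters in KRaft mode and is ignored in ZooKeeper mode. A minimal sketch of that rule follows, using trimmed copies of the proposed types. The constant names `KRaftMode` and `ZooKeeperMode` and the helper `runsController` are assumptions made for this example; only the enum values `kraft` and `zookeeper` come from the proposal.

```go
package main

import "fmt"

// ControllerMode mirrors the proposed enum (kraft;zookeeper).
type ControllerMode string

const (
	// Constant names are assumptions for this sketch.
	KRaftMode     ControllerMode = "kraft"
	ZooKeeperMode ControllerMode = "zookeeper"
)

// Broker is a trimmed copy of the proposed Broker struct.
type Broker struct {
	Id           int32
	CombinedNode bool
}

// runsController is a hypothetical helper. It reports whether a broker entry
// also runs a controller process: only when the cluster is in KRaft mode
// and the entry is flagged as a combined node.
func runsController(mode ControllerMode, b Broker) bool {
	return mode == KRaftMode && b.CombinedNode
}

func main() {
	combined := Broker{Id: 0, CombinedNode: true}
	fmt.Println(runsController(KRaftMode, combined))     // flag honored
	fmt.Println(runsController(ZooKeeperMode, combined)) // flag ignored
}
```

Keeping the check in one place would let the rest of the reconciliation code stay unaware of how the combined role is expressed in the spec.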

Decision Outcome

Option 2 was chosen. With CommonConfig as the shared foundation, the existing implementation extends to both broker and controller nodes, and later configuration additions can go into either the common part or one of the role-specific parts. A Venn diagram shows the relationship between broker and controller configurations:

[Figure: Venn diagram of broker and controller configurations]
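
The overlap in the diagram corresponds to CommonConfig being embedded in both BrokerConfig and ControllerConfig. The sketch below, which keeps only a few fields from the proposal, shows that Go's `encoding/json` flattens the embedded structs, so users still see a single flat configuration object per node. The `mustJSON` helper and the sample values are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed copies of the proposed types; most fields are omitted.
type CommonConfig struct {
	Image         string `json:"image,omitempty"`
	KafkaHeapOpts string `json:"kafkaHeapOpts,omitempty"`
}

type BrokerSpecificConfig struct {
	Config string `json:"config,omitempty"`
}

type ControllerSpecificConfig struct{}

type BrokerConfig struct {
	CommonConfig         `json:",inline"`
	BrokerSpecificConfig `json:",inline"`
}

type ControllerConfig struct {
	CommonConfig             `json:",inline"`
	ControllerSpecificConfig `json:",inline"`
}

// mustJSON is a hypothetical helper that marshals v or panics.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	broker := BrokerConfig{
		CommonConfig:         CommonConfig{Image: "example/kafka:latest"},
		BrokerSpecificConfig: BrokerSpecificConfig{Config: "default"},
	}
	controller := ControllerConfig{
		CommonConfig: CommonConfig{Image: "example/kafka:latest"},
	}
	// Embedded fields appear at the top level of the JSON object.
	fmt.Println(mustJSON(broker))
	fmt.Println(mustJSON(controller))
}
```

Because the embedded fields are promoted, existing code that reads something like `brokerConfig.Image` would keep compiling after the split, which is part of what makes this option a low-friction extension of the current API.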

Problem Statement - 2

How should Koperator deploy and manage the controller nodes?

Options

Option 1: StatefulSet

Deploy the controllers as a StatefulSet and the brokers/combined nodes as individual Pods.

Option 2: Pods

Manage all controllers and combined nodes as individual Pods, mirroring the approach used for broker nodes. Controller nodes are excluded from certain Cruise Control actions, in line with their restricted-access design.

Pros and Cons of the Options

Option 1 (StatefulSet)
Pros:
- Simplified management of controller nodes, which take over the role ZooKeeper used to play.
- Low complexity, because the Kubernetes StatefulSet controller does most of the work.
Cons:
- All controllers share one configuration; per-node customization is not supported.
- Specific controller nodes cannot be removed selectively: on scale-down, the StatefulSet controller removes the Pod with the highest ordinal first.

Option 2 (Pods)
Pros:
- Each controller node can be configured individually.
- Consistent with the existing broker Pod management model.
Cons:
- Increased overall complexity.
- Controller and broker nodes differ in nature: Cruise Control actions that target brokers do not apply to controllers, which may confuse developers and lead to inelegant implementations.
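
To make the scale-down limitation of Option 1 concrete, the sketch below models which Pods a StatefulSet would delete when its replica count is reduced. It assumes the default behavior, with ordinals starting at 0 and Pods named `<name>-<ordinal>`. The StatefulSet name `kafka-controller` and both helper functions are made up for the example.

```go
package main

import "fmt"

// podName follows the StatefulSet naming scheme "<statefulset name>-<ordinal>".
func podName(sts string, ordinal int) string {
	return fmt.Sprintf("%s-%d", sts, ordinal)
}

// removedOnScaleDown lists the Pods that would be deleted when scaling
// from `from` to `to` replicas, in deletion order (highest ordinal first).
// It returns nil when no scale-down happens.
func removedOnScaleDown(sts string, from, to int) []string {
	var removed []string
	for i := from - 1; i >= to; i-- {
		removed = append(removed, podName(sts, i))
	}
	return removed
}

func main() {
	// Scaling 5 controllers down to 3 always removes ordinals 4 and 3;
	// there is no way to pick, say, ordinal 1 instead.
	fmt.Println(removedOnScaleDown("kafka-controller", 5, 3))
}
```

With individually managed Pods (Option 2), the operator chooses which node to remove, at the cost of implementing that lifecycle logic itself.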

Decision Outcome

TBD