---
title: Configure a TiDB Cluster in Kubernetes
summary: Learn how to configure a TiDB cluster in Kubernetes.
category: how-to
---
This document introduces how to configure a TiDB cluster for production deployment.
Before deploying a TiDB cluster, you need to configure the resources for each component of the cluster depending on your needs. PD, TiKV, and TiDB are the core service components of a TiDB cluster. In a production environment, configure the resources of these components according to their needs. For details, refer to Hardware Recommendations.
To ensure the proper scheduling and stable operation of the components of the TiDB cluster in Kubernetes, it is recommended to set the Guaranteed-level quality of service (QoS) by making `limits` equal to `requests` when configuring resources. For details, refer to Configure Quality of Service for Pods.

If you are using a NUMA-based CPU, you need to enable the `Static` CPU management policy on the node for better performance. To allow the TiDB cluster components to monopolize the corresponding CPU resources, the CPU quota must be an integer greater than or equal to `1`, in addition to setting the Guaranteed-level QoS as mentioned above. For details, refer to Control CPU Management Policies on the Node.
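As an illustration, here is a minimal sketch of a Guaranteed-level resource configuration for TiKV in the `TidbCluster` CR, where `requests` equals `limits` (the CPU and memory values are assumptions; size them for your hardware):

```yaml
spec:
  tikv:
    # Guaranteed QoS: requests and limits are identical, and the CPU quota is an integer
    requests:
      cpu: "4"
      memory: 16Gi
    limits:
      cpu: "4"
      memory: 16Gi
```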
To configure a TiDB deployment, you need to configure the `TidbCluster` CR. Refer to the TidbCluster example for an example. For the complete configurations of the `TidbCluster` CR, refer to the API documentation.
Note:
It is recommended to organize the configurations for a TiDB cluster under a directory of `cluster_name` and save them as `${cluster_name}/tidb-cluster.yaml`. The modified configuration is not automatically applied to the TiDB cluster by default. The new configuration file is loaded only when the Pod restarts.

It is recommended that you set `spec.configUpdateStrategy` to `RollingUpdate` to enable automatic update of configurations. This way, every time the configuration is updated, all components are rolling updated automatically, and the modified configuration is applied to the cluster.
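For example, a minimal sketch of enabling rolling configuration updates in the CR:

```yaml
spec:
  configUpdateStrategy: RollingUpdate
```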
The cluster name can be configured by changing `metadata.name` in the `TidbCluster` CR.
Usually, components in a cluster are in the same version. It is recommended to configure `spec.<pd/tidb/tikv/pump/tiflash/ticdc>.baseImage` and `spec.version`. If you need to configure different versions for different components, you can configure `spec.<pd/tidb/tikv/pump/tiflash/ticdc>.version`.
Here are the formats of the parameters:

- `spec.version`: the format is `imageTag`, such as `v4.0.0`
- `spec.<pd/tidb/tikv/pump/tiflash/ticdc>.baseImage`: the format is `imageName`, such as `pingcap/tidb`
- `spec.<pd/tidb/tikv/pump/tiflash/ticdc>.version`: the format is `imageTag`, such as `v4.0.0`
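For example, a sketch that sets one cluster-wide version together with per-component base images (the version tag `v4.0.0` is only illustrative):

```yaml
spec:
  version: v4.0.0
  pd:
    baseImage: pingcap/pd
  tikv:
    baseImage: pingcap/tikv
  tidb:
    baseImage: pingcap/tidb
```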
You can set the storage class by modifying `storageClassName` of each component in `${cluster_name}/tidb-cluster.yaml` and `${cluster_name}/tidb-monitor.yaml`. For the storage classes supported by the Kubernetes cluster, check with your system administrator.
Different components of a TiDB cluster have different disk requirements. Before deploying a TiDB cluster, select the appropriate storage class for each component according to the storage classes supported by the current Kubernetes cluster and usage scenario.
For the production environment, local storage is recommended for TiKV. The actual local storage in Kubernetes clusters might be sorted by disk types, such as `nvme-disks` and `sas-disks`.

For a demonstration environment or functional verification, you can use network storage, such as `ebs` and `nfs`.
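For example, a minimal sketch that assigns a storage class to TiKV (the class name `local-storage` is an assumption; use a storage class that actually exists in your cluster):

```yaml
spec:
  tikv:
    storageClassName: local-storage
```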
Note:
If you set a storage class that does not exist when creating the TiDB cluster, the cluster creation goes to the Pending state. In this situation, you must destroy the TiDB cluster in Kubernetes.
The deployed cluster topology by default has 3 PD Pods, 3 TiKV Pods, and 2 TiDB Pods. In this deployment topology, the scheduler extender of TiDB Operator requires at least 3 nodes in the Kubernetes cluster to provide high availability. You can modify the `replicas` configuration to change the number of Pods for each component.
Note:
If the number of Kubernetes cluster nodes is less than 3, 1 PD Pod goes to the Pending state, and neither TiKV Pods nor TiDB Pods are created. When the number of nodes in the Kubernetes cluster is less than 3, to start the TiDB cluster, you can reduce both the number of PD Pods and the number of TiKV Pods in the default deployment to `1`.
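For reference, a sketch of the default topology expressed through `replicas` (reduce PD and TiKV to `1` only for single-node test clusters):

```yaml
spec:
  pd:
    replicas: 3
  tikv:
    replicas: 3
  tidb:
    replicas: 2
```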
If you want to enable TiFlash in the cluster, configure `spec.pd.config.replication.enable-placement-rules` to `true` and configure `spec.tiflash` in the `${cluster_name}/tidb-cluster.yaml` file as follows:
```yaml
  pd:
    config:
      ...
      replication:
        enable-placement-rules: "true"
      ...
  tiflash:
    baseImage: pingcap/tiflash
    maxFailoverCount: 3
    replicas: 1
    storageClaims:
      - resources:
          requests:
            storage: 100Gi
        storageClassName: local-storage
```
TiFlash supports mounting multiple Persistent Volumes (PVs). If you want to configure multiple PVs for TiFlash, configure multiple `resources` in `tiflash.storageClaims`, each `resources` with a separate `storage` request and `storageClassName`. For example:
```yaml
  tiflash:
    baseImage: pingcap/tiflash
    maxFailoverCount: 3
    replicas: 1
    storageClaims:
      - resources:
          requests:
            storage: 100Gi
        storageClassName: local-storage
      - resources:
          requests:
            storage: 100Gi
        storageClassName: local-storage
```
Warning:
Since TiDB Operator mounts PVs automatically in the order of the items in the `storageClaims` list, if you need to add more disks to TiFlash, make sure to append the new item only to the end of the original items, and DO NOT modify the order of the original items.
If you want to enable TiCDC in the cluster, you can add the TiCDC spec to the `TidbCluster` CR. For example:
```yaml
spec:
  ticdc:
    baseImage: pingcap/ticdc
    replicas: 3
```
This section introduces how to configure the parameters of TiDB/TiKV/PD/TiFlash/TiCDC.
The current TiDB Operator v1.1 supports all parameters of TiDB v4.0.
TiDB parameters can be configured by `spec.tidb.config` in the TidbCluster Custom Resource.
For example:
```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  ....
  tidb:
    image: pingcap.com/tidb:v4.0.0
    imagePullPolicy: IfNotPresent
    replicas: 1
    service:
      type: ClusterIP
    config:
      split-table: true
      oom-action: "log"
    requests:
      cpu: 1
```
For all the configurable parameters of TiDB, refer to TiDB Configuration File.
Note:
If you deploy your TiDB cluster using CR, make sure that `Config: {}` is set, no matter whether you want to modify `config` or not. Otherwise, TiDB components might not be started successfully. This step is meant to be compatible with `Helm` deployment.
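In the CR YAML, this corresponds to keeping at least an empty `config` field for the component, for example (a minimal sketch):

```yaml
spec:
  tidb:
    config: {}
```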
TiKV parameters can be configured by `spec.tikv.config` in the TidbCluster Custom Resource.
For example:
```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  ....
  tikv:
    image: pingcap.com/tikv:v4.0.0
    config:
      log-level: "info"
      slow-log-threshold: "1s"
    replicas: 1
    requests:
      cpu: 2
```
For all the configurable parameters of TiKV, refer to TiKV Configuration File.
Note:
If you deploy your TiDB cluster using CR, make sure that `Config: {}` is set, no matter whether you want to modify `config` or not. Otherwise, TiKV components might not be started successfully. This step is meant to be compatible with `Helm` deployment.
PD parameters can be configured by `spec.pd.config` in the TidbCluster Custom Resource.
For example:
```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  .....
  pd:
    image: pingcap.com/pd:v4.0.0
    config:
      lease: 3
      enable-prevote: true
```
For all the configurable parameters of PD, refer to PD Configuration File.
Note:
If you deploy your TiDB cluster using CR, make sure that `Config: {}` is set, no matter whether you want to modify `config` or not. Otherwise, PD components might not be started successfully. This step is meant to be compatible with `Helm` deployment.
TiFlash parameters can be configured by `spec.tiflash.config` in the TidbCluster Custom Resource.
For example:
```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  ...
  tiflash:
    config:
      config:
        logger:
          count: 5
          level: information
```
For all the configurable parameters of TiFlash, refer to TiFlash Configuration File.
You can configure TiCDC start parameters through `spec.ticdc.config` in the TidbCluster Custom Resource.
For example:
```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  ...
  ticdc:
    config:
      timezone: UTC
      gcTTL: 86400
      logLevel: info
```
For all configurable start parameters of TiCDC, see TiCDC start parameters.
Note:
TiDB Operator provides a custom scheduler that guarantees the TiDB service can tolerate host-level failures through the specified scheduling algorithm. Currently, the TiDB cluster uses this scheduler as the default scheduler, which is configured through the item `spec.schedulerName`. This section focuses on configuring a TiDB cluster to tolerate failures at other levels such as rack, zone, or region. This section is optional.
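For reference, a sketch of how the scheduler is selected in the CR (`tidb-scheduler` is the name of the scheduler deployed by TiDB Operator by default; confirm the name used in your installation):

```yaml
spec:
  schedulerName: tidb-scheduler
```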
TiDB is a distributed database, and its high availability must ensure that when any physical topology node fails, not only is the service unaffected, but the data is also complete and available. The two configurations of high availability are described separately as follows.
High availability at other levels (such as rack, zone, and region) is guaranteed by the `podAntiAffinity` of Affinity. `podAntiAffinity` can avoid the situation where different instances of the same component are deployed on the same physical topology node. In this way, disaster recovery is achieved. For a detailed user guide on Affinity, see Affinity & AntiAffinity.
The following is an example of a typical service high availability setup:
{{< copyable "" >}}
```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    # this term works when the nodes have the label named region
    - weight: 10
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: ${cluster_name}
            app.kubernetes.io/component: "pd"
        topologyKey: "region"
        namespaces:
        - ${namespace}
    # this term works when the nodes have the label named zone
    - weight: 20
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: ${cluster_name}
            app.kubernetes.io/component: "pd"
        topologyKey: "zone"
        namespaces:
        - ${namespace}
    # this term works when the nodes have the label named rack
    - weight: 40
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: ${cluster_name}
            app.kubernetes.io/component: "pd"
        topologyKey: "rack"
        namespaces:
        - ${namespace}
    # this term works when the nodes have the label named kubernetes.io/hostname
    - weight: 80
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: ${cluster_name}
            app.kubernetes.io/component: "pd"
        topologyKey: "kubernetes.io/hostname"
        namespaces:
        - ${namespace}
```
Before configuring the high availability of data, read Information Configuration of the Cluster Topology, which describes how the high availability of the TiDB cluster is implemented.
To add the data high availability feature in Kubernetes:
- Set the label collection of topological location for PD.

  Replace the `location-labels` information in `pd.config` with the label collection that describes the topological location of the nodes in the Kubernetes cluster, as shown in the sketch after this list.

  Note:

  - For PD versions earlier than v3.0.9, `/` in the label name is not supported.
  - If you configure `host` in `location-labels`, TiDB Operator gets the value from `kubernetes.io/hostname` in the node label.
- Set the topological information of the Node where the TiKV node is located.

  TiDB Operator automatically obtains the topological information of the Node for TiKV and calls the PD interface to set this information as the information of TiKV's store labels. Based on this topological information, the TiDB cluster schedules the replicas of the data.

  If the Node of the current Kubernetes cluster does not have a label indicating the topological location, or if the existing topology label name contains `/`, you can manually add a label to the Node by running the following command:

  {{< copyable "shell-regular" >}}

  ```shell
  kubectl label node ${node_name} region=${region_name} zone=${zone_name} rack=${rack_name} kubernetes.io/hostname=${host_name}
  ```

  In the command above, `region`, `zone`, `rack`, and `kubernetes.io/hostname` are just examples. The name and number of the labels to be added can be arbitrarily defined, as long as they conform to the specification and are consistent with the labels set by `location-labels` in `pd.config`.
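As referenced above, a sketch of `pd.config` with `location-labels` matching such node labels (the label names are illustrative and must match the labels actually present on your nodes; this assumes the nested-YAML config format shown earlier in this document):

```yaml
  pd:
    config:
      replication:
        location-labels: ["region", "zone", "rack", "host"]
```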