Merge pull request linuxkerneltravel#525 from ESWZY/traffic-mgr/doc
Update overall project documentation for TrafficManager

Showing 11 changed files with 321 additions and 52 deletions.
@@ -0,0 +1,63 @@

# Concept

In a microservice system, the whole cluster comes under high load during peak traffic periods. When the load surpasses the cluster's maximum capacity, individual nodes can become so overloaded that they are unable to process subsequent requests.

Furthermore, service instances in a microservice system depend on one another in cascades: if downstream service instances fail, upstream instances can be left waiting or even crash. There is therefore a need to proactively reject requests that are likely to fail, so that they fail fast.

Today, failing requests fast requires intrusive mechanisms such as sidecars or iptables rules, which often suffer from performance and operability problems. Hence, there is a need to classify and redirect request traffic efficiently with the Linux kernel's eBPF technology, so that traffic is always directed towards available service instances.

This project therefore combines eBPF with microservice techniques to build a non-intrusive mechanism for classifying and redirecting microservice request traffic. Key capabilities include replacing large numbers of iptables lookups and NAT operations for service requests, fast weighted selection of backend Pods (instead of purely random selection), kernel-level canary testing for microservices, and dynamic adjustment of backend selection weights based on external performance metrics, so that traffic allocation follows changes in those metrics.

## Overall Architecture

The following is the overall architecture of the project:

![Architecture](doc/img/architecture.svg)

TrafficManager collects data from multiple sources, including cluster metadata from Kubernetes and availability and performance data from metric monitoring systems or AI Ops systems. After comprehensive analysis, it distributes Pod handling and selection logic into the kernel-mode Control Map and Data Map. Kernel-mode monitoring and operations begin once the eBPF program is attached.

When Pods within the cluster initiate requests to a specific Service (e.g. `http://<svc>.<ns>.svc.cluster.local`), the eBPF program attached by TrafficManager intercepts the connect(2) system call. After identifying and analyzing the request, reaching a verdict, and performing the redirection, it modifies the request transparently to the user, redirecting it to a new target Pod. The request then traverses the overlay network smoothly and reaches the target Pod on the target node directly (`http://<pod>.<svc>.<ns>.svc.cluster.local`).
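
The sketch below illustrates the general shape of such an interception point. It is a minimal, hypothetical example of a `cgroup/connect4` eBPF program that rewrites a destination address; the struct and map names are illustrative and do not reflect TrafficManager's actual code.

```c
// Illustrative sketch: rewrite the destination of an outgoing connect(2)
// from a Service virtual IP to a chosen Pod address. Names are hypothetical.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct svc_key {
    __u32 ip;    /* Service ClusterIP, network byte order */
    __u16 port;  /* Service port, host byte order */
    __u16 pad;
};

struct pod_addr {
    __u32 ip;    /* Pod IP, network byte order */
    __u16 port;  /* Pod port, host byte order */
    __u16 pad;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, struct svc_key);
    __type(value, struct pod_addr);
} svc_to_pod SEC(".maps");

SEC("cgroup/connect4")
int redirect_connect4(struct bpf_sock_addr *ctx)
{
    struct svc_key key = {
        .ip   = ctx->user_ip4,
        .port = bpf_ntohs(ctx->user_port),
    };

    struct pod_addr *pod = bpf_map_lookup_elem(&svc_to_pod, &key);
    if (pod) {
        /* Rewrite the destination before the connection is established,
         * so the caller transparently talks to the selected Pod. */
        ctx->user_ip4  = pod->ip;
        ctx->user_port = bpf_htons(pod->port);
    }
    return 1; /* 1 = allow the connect() to proceed */
}

char LICENSE[] SEC("license") = "GPL";
```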

## Design and Implementation

### Abstraction and Storage Design

For eBPF, passing user-space data into kernel space requires eBPF maps. However, eBPF maps are essentially flat key-value stores, which makes it awkward to keep the complex information of Pods, Services, and other entities in its original form. We therefore have to consider how these fields are abstracted and how they map onto one another. As a result, this part is split by function into two maps: the Data Map and the Control Map.

#### Data Map

The Data Map is used solely to store metadata for backend Pods and is indexed by unique identifiers. It serves as pure data storage.

![Data Map](doc/img/data-map.svg)
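
As a rough sketch, the Data Map can be pictured as a hash map keyed by a backend identifier. The field names below are assumptions made for illustration, not the project's actual definitions.

```c
/* Hypothetical layout of the Data Map: backend ID -> Pod metadata. */
struct backend_key {
    __u32 backend_id;   /* unique identifier assigned by the user-space agent */
};

struct backend_info {
    __u32 ip;           /* Pod IP, network byte order */
    __u16 port;         /* Pod port, host byte order */
    __u16 flags;        /* e.g. healthy / draining */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct backend_key);
    __type(value, struct backend_info);
} data_map SEC(".maps");
```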

#### Control Map

When a request is detected, the Control Map is used to swiftly analyze the cluster's current operational status and select an appropriate result for modifying the request according to pre-defined action rules. Lookups are keyed by the target IP, target port, and an index number.

When the index number is 0, the entry is typically used to analyze the current Service's status, and a secondary lookup is required to select a backend Pod. Different behaviors correspond to different formats of the "Options" field, which is how several of this project's functionalities are implemented.

![Control Map](doc/img/control-map.svg)
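
The layout can be sketched roughly as follows; again, the struct and field names are assumptions for illustration only.

```c
/* Hypothetical layout of the Control Map, keyed by (target IP, target port, index).
 * Index 0 describes the Service itself; indices 1..N describe backend slots. */
struct control_key {
    __u32 dst_ip;    /* Service ClusterIP, network byte order */
    __u16 dst_port;  /* Service port, host byte order */
    __u16 index;     /* 0 = service entry, >0 = backend slot */
};

struct control_value {
    __u32 action;       /* how to handle the request (redirect, reject, ...) */
    __u32 backend_num;  /* index 0 only: number of backend slots */
    __u32 backend_id;   /* index > 0 only: key into the Data Map */
    __u32 options;      /* action-specific data, e.g. weight or canary settings */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct control_key);
    __type(value, struct control_value);
} control_map SEC(".maps");
```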

### Traffic Control Methods

Building on the definitions above, this section explains how these data structures are used. Please note that these usage methods may change or be expanded as development progresses.

![Traffic Control](doc/img/traffic-control.svg)

For the standard backend selection method based on random lookup, we set the index number to 0 and perform a combined lookup in the Control Map using the target IP and port of the current request. This tells us how many backend Pods the Service has. For Service 1 in the illustration above, for example, there are two backend Pods, so each is selected with a 50% probability by using a random Pod index as the index number for the next lookup. After obtaining the Backend ID, we can look up the destination Pod's IP and port in the Data Map.
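
Continuing with the illustrative map layouts sketched earlier, the two-stage lookup could look roughly like this (not the project's actual code):

```c
/* Sketch of the random-selection path, using the hypothetical maps above. */
static __always_inline struct backend_info *
pick_random_backend(__u32 dst_ip, __u16 dst_port)
{
    struct control_key key = { .dst_ip = dst_ip, .dst_port = dst_port, .index = 0 };
    struct control_value *svc = bpf_map_lookup_elem(&control_map, &key);
    if (!svc || svc->backend_num == 0)
        return NULL;   /* unknown Service, or no usable backends: fail fast */

    /* Pick one of the backend slots uniformly at random, e.g. slot 1 or 2
     * when the Service has two Pods (a 50% split each). */
    key.index = 1 + bpf_get_prandom_u32() % svc->backend_num;
    struct control_value *slot = bpf_map_lookup_elem(&control_map, &key);
    if (!slot)
        return NULL;

    /* Resolve the Backend ID to the Pod's address in the Data Map. */
    struct backend_key bkey = { .backend_id = slot->backend_id };
    return bpf_map_lookup_elem(&data_map, &bkey);
}
```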

For the weight-based selection method (Service 2 - old in the diagram above), the initial steps are the same as for random selection, but each entry carries an additional field indicating the selection probability (i.e., the weight) of the corresponding Pod. The eBPF program employs an algorithm with O(log₂(n)) complexity to choose a suitable backend Pod.
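
One way to achieve O(log₂(n)) selection, assumed here purely for illustration, is to store the cumulative weight in each backend slot's options field and binary-search a random draw against it:

```c
/* Sketch of weighted selection via binary search over cumulative weights.
 * Assumes slot i's options field holds weight(1) + ... + weight(i). */
#define MAX_BACKEND_BITS 16   /* supports up to 2^16 backend slots */

static __always_inline __u16
pick_weighted_slot(__u32 dst_ip, __u16 dst_port, __u32 backend_num, __u32 total_weight)
{
    struct control_key key = { .dst_ip = dst_ip, .dst_port = dst_port };
    __u32 lo = 1, hi = backend_num;
    __u32 draw;

    if (backend_num == 0 || total_weight == 0)
        return 0;   /* nothing to select */

    draw = bpf_get_prandom_u32() % total_weight;

    /* Bounded binary search: the eBPF verifier needs a constant loop bound. */
    for (int i = 0; i < MAX_BACKEND_BITS && lo < hi; i++) {
        __u32 mid = (lo + hi) / 2;
        struct control_value *slot;

        key.index = mid;
        slot = bpf_map_lookup_elem(&control_map, &key);
        if (!slot)
            break;
        if (slot->options <= draw)   /* cumulative weight up to mid is too small */
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;   /* first slot whose cumulative weight exceeds the draw */
}
```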

For Services marked for a traffic canary (Service 2 - new in the diagram above), additional fields control how traffic is steered towards the older version of the Service. The selection process for the other Pods is the same as in the weight-based method. However, if the older version is chosen as the destination based on weight, the relevant information for the older version Service is retrieved from the Data Map and backend Pod selection is performed through a **separate** lookup process (shown in the diagram as Service 2 - old).
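
Under the same illustrative layout, the canary path can be pictured as one extra hop: if the draw lands on the slot that stands for the old version of the Service, its Data Map entry resolves to that Service's own address, and selection is repeated against it. The flag below is a made-up convention for this sketch, not the project's actual encoding.

```c
/* Canary sketch: a backend entry may represent another Service rather than
 * a Pod; in that case, rerun selection against that Service's own entries. */
#define FLAG_IS_SERVICE 0x1   /* hypothetical marker in backend_info.flags */

static __always_inline struct backend_info *
pick_canary_backend(__u32 dst_ip, __u16 dst_port)
{
    struct backend_info *hit = pick_random_backend(dst_ip, dst_port);
    if (!hit)
        return NULL;

    /* The draw selected the "old version" pseudo-backend: its address is the
     * old Service's virtual IP and port, so select a real Pod behind it. */
    if (hit->flags & FLAG_IS_SERVICE)
        return pick_random_backend(hit->ip, hit->port);

    return hit;
}
```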

### Dynamic Traffic Management

With this project, we can achieve dynamic traffic management in response to various cluster states. The diagram below outlines a dynamic traffic management approach based on load metrics (refer to [automatic_test.go](../acceptance/automatic/automatic_test.go)).

![Dynamic Traffic](doc/img/dynamic-control.svg)

After load metrics are obtained through monitoring tools such as Node Exporter and cAdvisor, the data is stored in Prometheus. These metrics are then used to assess the availability of the cluster, its nodes, and individual Pods. The assessment can be based on traditional metric calculations or incorporate AI Ops techniques for more sophisticated evaluations.

Once the availability of the cluster, nodes, and individual Pods has been calculated, TrafficManager performs a comprehensive assessment and designs corresponding strategies for the Service. These strategies may include the traffic handling method, the identification of non-functioning Pods, and the traffic allocation proportion for each Pod.

Finally, this information is distributed to kernel space through the associated maps and eBPF programs for request handling.

@@ -0,0 +1,44 @@

# Install tutorial

## Ubuntu 22.04

### Install Dependencies

```bash
# Install Go
wget https://go.dev/dl/go1.20.5.linux-amd64.tar.gz
rm -rf /usr/local/go && tar -C /usr/local -xzf go1.20.5.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin

# Install Docker
snap refresh
snap install docker

# Install and start local Kubernetes
snap install kubectl --classic
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
minikube start --kubernetes-version=1.26.6 --force

# Install eBPF development tools
apt update -y
apt install -y llvm clang make gcc
apt install -y libbfd-dev libcap-dev libelf-dev
git clone --recurse-submodules https://github.com/libbpf/bpftool.git
make install -C bpftool/src/
cp bpftool/src/bpftool /usr/bin/
rm -rf bpftool/
```

### Apply Test Data

```bash
kubectl apply -f acceptance/testdata/k8s/
```

### Initialization

```bash
make init
make
```

@@ -1,64 +1,52 @@
-# eBPF Traffic Manager
+# Traffic Manager

-Based on the abstraction of Kubernetes Service and Pod, and the modification of network request events, this project can realize the following functions:
+[![Traffic Manager](https://github.com/linuxkerneltravel/lmp/actions/workflows/net_traffic_manager.yml/badge.svg)](https://github.com/linuxkerneltravel/lmp/actions/workflows/net_traffic_manager.yml)
+[![LICENSE](https://img.shields.io/github/license/linuxkerneltravel/lmp.svg?style=square)](https://github.com/linuxkerneltravel/lmp/blob/develop/LICENSE)

-1. Parse Service and redirect requests directly to backend Pods, avoiding the NAT of iptables.
-2. Filter out abnormal Pods to avoid requesting Pods that cannot work normally. If none of the pods are working, reject the request.
-3. Grayscale release: canary release and blue-green release. Provides cross-Service traffic modification capabilities. Select a specific part of the caller to call a specific version of the service to realize traffic migration or version upgrade.
-4. Support consistent hashing: use relevant fields (such as IP, port, protocol, etc.) for hash mapping to ensure that multiple requests from a specific source will be directed to a unique backend Pod.
+## Introduction

-## Install tutorial
+Traffic Manager is an eBPF-based traffic management tool. It leverages a **non-intrusive, high-speed kernel-programmable mechanism** to achieve cost-effective and dynamic microservice traffic orchestration.

-### Ubuntu 22.04
+![Architecture](doc/img/architecture.svg)

-```bash
-# Install Go
-wget https://go.dev/dl/go1.20.5.linux-amd64.tar.gz
-rm -rf /usr/local/go && tar -C /usr/local -xzf go1.20.5.linux-amd64.tar.gz
-export PATH=$PATH:/usr/local/go/bin
+## Capabilities

-# Install Docker
-snap refresh
-snap install docker
+Based on abstractions of Kubernetes Services and Pods, as well as the modification of network request events, this project can achieve the following functionalities through refined operational logic:

-# Install and start local Kubernetes
-snap install kubectl --classic
-curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
-sudo install minikube-linux-amd64 /usr/local/bin/minikube
-minikube start --kubernetes-version=1.26.6 --force
+**Service Resolution**: It directs Service requests directly to backend Pods, bypassing massive iptables lookups and iptables NAT.

-# Install eBPF development tools
-apt update -y
-apt install -y llvm clang make gcc
-apt install -y libbfd-dev libcap-dev libelf-dev
-git clone --recurse-submodules https://github.com/libbpf/bpftool.git
-make install -C bpftool/src/
-cp bpftool/src/bpftool /usr/bin/
-rm -rf bpftool/
-```
+**Non-intrusive Traffic Management**: It offers the ability to modify traffic across Pods and Services. Callers can invoke particular versions of a service, facilitating traffic migration or version rolling upgrades.

-```bash
-kubectl apply -f acceptance/testdata/k8s/
-```
+**Metric-Based Traffic Management**: By using metric inputs, it filters and eliminates abnormal Pods, preventing requests from reaching malfunctioning Pods. If all Pods are unable to work correctly, the request is denied outright (as shown in the diagram below).

-```bash
-make init
-make
-```
+![Dynamic Control](doc/img/dynamic-control.svg)

-## Usage
+## Getting Started

-Developing...
+For installation and initialization instructions, please refer to the documentation: [INSTALL.md](INSTALL.md).
+
+To get started, check out the introductory guide [here](doc/getting-started.md).
+
+## Documentation
+
+Conceptual documentation is here to provide an understanding of overall architecture and implementation details: [CONCEPT.md](CONCEPT.md).
+
+You can refer to some eBPF development documents at: [eBPF Development Tutorial](../sidecar/bpf/README.md#functional-bpf-programs).

 ## Roadmap

-Project development plan:
+The roadmap provides an overview of the project's development plans and completion status.
+
+Detailed changelogs can be found here: [CHANGELOG.md](CHANGELOG.md).

 - [x] Build the basic development framework and automatic compilation pipeline.
 - [x] Implement kernel abstraction of Service and Pod, and design corresponding maps for storage and information transfer.
-- [ ] Implement cluster metadata analysis and map read and write update in user mode. Consider using the Kubernetes Controller's control loop to monitor changes to the current cluster and keep the metadata in the map always up to date.
-- [ ] Performance optimization and development framework arrangement.
-- [ ] Investigate and develop consistent hashing capabilities to achieve fast hashing and fast Pod selection.
-- [ ] Investigate and develop grayscale release function of traffic, such as canary release and blue-green release, which provides cross-Service traffic modification capabilities.
-- [ ] Implement filtering out specific abnormal nodes and Pods based on external cluster monitoring information.
-- [ ] Documentation and tutorials.
+- [x] Implement cluster metadata analysis and map read and write update in user mode. Consider using the Kubernetes Controller's control loop to monitor changes to the current cluster and keep the metadata in the map always up to date.
+- [x] Performance optimization and development framework arrangement.
+- [x] Investigate and develop grayscale release function of traffic, such as canary release and blue-green release, which provides cross-Service traffic modification capabilities.
+- [x] Implement filtering out specific abnormal nodes and Pods based on external cluster monitoring information.
+- [x] Performance optimization.
+- [x] Documentation and tutorials.
+- [ ] Access more monitoring data sources, guide TrafficManager to conduct traffic management through more complex indicators, and even AI Ops mechanisms.
+- [ ] Compress and reuse Map space, and minimize Map space through mechanisms such as `Union`.
+- [ ] Dynamically update the Map mechanism instead of updating by deleting and re-inserting.