Skip to content

Commit

Permalink
GITBOOK-216: No subject
Browse files Browse the repository at this point in the history
  • Loading branch information
mouuii authored and gitbook-bot committed Jan 8, 2025
1 parent 284e4e1 commit 54b6be4
Showing 1 changed file with 114 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,120 @@

##  概括

该提案提出在 client-go clientsets 中添加 context 
该提案提出允许 k8s.io/client-go clientset 包发送请求的时候要传递 context

## 动机

使用 client-go 时,会对 Kubernetes API 进行外部调用以管理资源。沿着这些请求路径添加对上下文传播的支持可以实现横切功能的干净高效的实现。该功能最初将包括:
当我们使用 client-go 发送 API 请求访问集群内部资源时,添加 context 能够让请求干净有效的处理 cross cutting fuctionality 。 什么是 cross cutting functionality ,这里官方举了两个例子

* 支持请求超时和取消,以释放调用线程 ,大家可以看看这个pr:[https://github.com/kubernetes/kubernetes/pull/83064](https://github.com/kubernetes/kubernetes/pull/83064)
* 支持分布式跟踪

##  目标

* 允许 k8s.io/client-go 传递 context
* 清理不一致的 \*Option 选项

## 提议

推荐的方法是在对资源的操作方法上加 context 参数 。 我们将修改所有客户端接口的签名方法, `context.Context`作为第一个参数。这是惯用的、明确的,并且产生最不容易出错的 API。

## 设计细节

```go
func (c *pods) Get(ctx context.Context, name string, options metav1.GetOptions) (result *v1.Pod, err error) {
Result = &v1.Pod{}
err = c.client.Get().
Namespace(c.ns).
Resource("pods").
Name(name).
VersionedParams(&options, scheme.ParameterCodec).
Do(ctx).
Into(result)
return
}
```

在这里,我们扩展签名以添加`context.Context`作为第一个参数,然后将 context 附加到 request 上。

## 清理不一致的选项传递

这些是现有客户端 API 的签名:

```go
func (c *flunders) Get(string, metav1.GetOptions) (*v1beta1.Flunder, error)
func (c *flunders) List(metav1.ListOptions) (*v1beta1.FlunderList, error)
func (c *flunders) Watch(metav1.ListOptions) (watch.Interface, error)
func (c *flunders) Create(*v1beta1.Flunder) (*v1beta1.Flunder, error)
func (c *flunders) Update(*v1beta1.Flunder) (*v1beta1.Flunder, error)
func (c *flunders) UpdateStatus(*v1beta1.Flunder) (*v1beta1.Flunder, error)
func (c *flunders) Delete(string, *metav1.DeleteOptions) error
func (c *flunders) DeleteCollection(*metav1.DeleteOptions, metav1.ListOptions) error
func (c *flunders) Patch(string, types.PatchType, []byte, ...string) (*v1beta1.Flunder, error)
```

在完成所有建议的更改后,签名将如下所示:

```go
func (c *flunders) Get(context.Context, string, metav1.GetOptions) (*v1beta1.Flunder, error)
func (c *flunders) List(context.Context, metav1.ListOptions) (*v1beta1.FlunderList, error)
func (c *flunders) Watch(context.Context, metav1.ListOptions) (watch.Interface, error)
func (c *flunders) Create(context.Context, *v1beta1.Flunder, metav1.CreateOptions) (*v1beta1.Flunder, error)
func (c *flunders) Update(context.Context, *v1beta1.Flunder, metav1.UpdateOptions) (*v1beta1.Flunder, error)
func (c *flunders) UpdateStatus(context.Context, *v1beta1.Flunder, metav1.UpdateOptions) (*v1beta1.Flunder, error)
func (c *flunders) Delete(context.Context, string, metav1.DeleteOptions) error
func (c *flunders) DeleteCollection(context.Context, metav1.DeleteOptions, metav1.ListOptions) error
func (c *flunders) Patch(context.Context, string, types.PatchType, []byte, metav1.PatchOptions, ...string) (*v1beta1.Flunder, error)
```

## 风险和缓解措施

对于 Kubernetes 代码库之外的客户端集的使用者,我们将拍摄以下包的时间点快照。

```shell
k8s.io/apiextensions-apiserver/pkg/client/{clientset => deprecated}
k8s.io/client-go/{kubernetes => deprecated}
k8s.io/kube-aggregator/pkg/client/clientset_generated/{clientset => deprecated}
k8s.io/metrics/pkg/client/{clientset => deprecated}
```

这些快照将存在 2 个版本。这允许消费者最初重写导入(例如使用 sed)并逐步将代码迁移到新的 API。这为消费者提供了 4 个版本窗口(而不是标准的 2 个版本),以便在其 client-go 版本不再受到支持之前迁移到新的 API。

## 代码解析

首先我们看下当时遇到的问题描述:

Recently we found that kube-apiserver will leak(or we can call it backlog) goruntine in large-scale scene with lots of 504 response code.

We found that there is no context-aware method to cancel the admission and calling the webhooks. As we know, the kube-apiserver handler is wrapper with [timeout filter](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/filters/timeout.go) and put the logical process(in which apiserver admission and call webhooks) in another goruntine, then if the request timeout or the client close the request, the logical process goruntine will not quit until the admission and calling webhooks finished.

The webhook create context itself, and does not get the context from request. The context will never be cancled until the webhook server responses. And the timeout filter wrappered goruntine will not quit until the webhook request finished. Think about in large-scale sence, the webhook server performance will influence the apiserver goruntine backlog.

```go
func (a *Webhook) Dispatch(attr admission.Attributes, o admission.ObjectInterfaces) error {
if rules.IsWebhookConfigurationResource(attr) {
return nil
}
if !a.WaitForReady() {
return admission.NewForbidden(attr, fmt.Errorf("not yet ready to handle request"))
}
hooks := a.hookSource.Webhooks()
// TODO: Figure out if adding one second timeout make sense here.
ctx := context.TODO()

return a.dispatcher.Dispatch(ctx, attr, o, hooks)
}
```



解决该问题的 pr 主要有两个:

{% embed url="https://github.com/kubernetes/kubernetes/pull/81602/" %}

{% embed url="https://github.com/kubernetes/kubernetes/pull/81602/" %}

* Plumbs context to admission webhooks, uses lesser of context deadline or webhook-specific timeout when calling webhooks.
* Adds methods that take a context argument to `*Review` client interfaces
* Ensures context is available to logic that determines whether to retry requests for webhooks, so a canceled or timed-out context can short-circuit retries
* Ensures context is available to authorizers and authenticators that make remote calls so they can short-circuit the remote call if the context is canceled or times-out

0 comments on commit 54b6be4

Please sign in to comment.