Client connection not authorized #24600

Open
zhangluva opened this issue Dec 17, 2024 · 0 comments
Labels
kind/bug Something isn't working

Comments

zhangluva commented Dec 17, 2024

Version & Environment

Redpanda version: (use rpk version):

Version:     v24.2.2
Git ref:     47443522ef
Build date:  2024-08-07T17:14:51Z
OS/Arch:     linux/amd64
Go version:  go1.22.2

Redpanda Cluster
  node-0  v24.2.2 - 47443522efaf0e1884b082a101d52dd3fdef984f
  node-1  v24.2.2 - 47443522efaf0e1884b082a101d52dd3fdef984f
  node-2  v24.2.2 - 47443522efaf0e1884b082a101d52dd3fdef984f

Container image: docker.redpanda.com/redpandadata/redpanda:v24.2.2
Kubernetes version: 1.29.10

What went wrong?

We have a client that establishes 8 connections to the cluster. Sometimes one or a few of the connections fail and stay in the bad state until the client is restarted.

On the server side, we see log lines like the one below, repeating until the client is restarted.

2024-12-17 12:12:02.363	
INFO  2024-12-17 17:12:02,362 [shard 1:main] kafka - 172.18.107.224:60322 failed authorization - connection_context.cc:305 - proto: kafka rpc protocol, acl op: write, principal: type {user} name {bootstrap.cluster}, resource: topic-for-this-client
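
To see which principals the brokers are rejecting and from which client addresses, the failed-authorization lines can be tallied. This is a minimal sketch that assumes the log format shown above and IPv4 client addresses; the function name `failed_authz_summary` is made up for illustration.

```shell
# Summarize "failed authorization" log lines by client IP and principal.
# Reads Redpanda log lines on stdin. Assumes the line format shown in this
# issue and IPv4 client addresses; lines that do not match are ignored.
failed_authz_summary() {
  # Capture the client IP (the port is dropped, since every connection uses
  # a different one) and the principal name, then count identical pairs.
  sed -n 's/.* kafka - \([0-9.]*\):[0-9]* failed authorization .*principal: type {[^}]*} name {\([^}]*\)}.*/\1 \2/p' \
    | sort | uniq -c
}

# Example (the log file path is hypothetical):
#   failed_authz_summary < redpanda-node-0.log
```

Each output line is a count followed by the client IP and the principal name the broker attributed to that connection.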

The cluster uses mTLS for authentication. The client's certificate has the common name client-1.
We use a script to grant the client permissions:

user=client-1
rpk security acl create --allow-principal $user --operation describe_configs --cluster
rpk security acl create --allow-principal $user --operation describe,describe_configs,all --topic topic-for-this-client --resource-pattern-type prefixed
rpk security acl create --allow-principal $user --operation all --group '*'
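
As a sanity check on the client side, the CN of the certificate the client actually loads can be read with `openssl`. This is a minimal sketch; the helper name `cert_cn` and the certificate path are hypothetical, and it only confirms what the certificate file contains, not what the broker mapped the connection to.

```shell
# Print the subject CN of a PEM certificate file.
# RFC2253 name formatting is requested so the subject is printed as
# "subject=CN=client-1,..." rather than in the OpenSSL default layout.
cert_cn() {
  openssl x509 -in "$1" -noout -subject -nameopt RFC2253 \
    | sed -n 's/.*CN=\([^,]*\).*/\1/p'
}

# Example (the path is hypothetical); for this client it should print client-1:
#   cert_cn /path/to/client.crt
#
# The ACLs created above can be listed for the same principal with:
#   rpk security acl list --allow-principal client-1
```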

The server nodes use a certificate with the CN bootstrap.cluster. The log shows the server's CN, not the client's, as the principal on the failed topic write. The server certificate is used only by the cluster's server nodes and nowhere else.
Also, not all connections from the client fail, only some of them.

What should have happened instead?

The client should connect and be authorized when a correct cert/key pair is provided.

How to reproduce the issue?

We don't have a good way to reproduce the errors. It happens quite often when a new client image is rolled out.

Additional information

Please attach any relevant logs, backtraces, or metric charts.

JIRA Link: CORE-8613

@zhangluva zhangluva added the kind/bug Something isn't working label Dec 17, 2024