-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DPE-6371] Add TLS support #4
base: main
Are you sure you want to change the base?
Conversation
9a95f78
to
4498bd3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks very very good! I feel that we really do have a lot of re-usable code and patterns for TLS that we can use across the board. Reviewing these PRs is becoming almost boring :P I believe the next step here would really be to decouple the classes into a common class, and then implement the specifics bit in a subclass. It is similar to a comment I had to a PR in Spark, see here. Maybe you can pair between teams to discuss this
Also by taking inspiration from the PR in Spark, I only have a small point to add a "direct" check for the certificate in the integration tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! thank you for addressing the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outstanding work - though I think you're a victim of success here. I mentioned in a comment already, but I think that a lot of the small tasteful changes you've made here, really should be ported back to Kafka. I think this repo is shaping up to be the example that we should follow in Kafka.
src/events/connect.py
Outdated
elif self.context.ready: | ||
self.charm._set_status(Status.SERVICE_NOT_RUNNING) | ||
else: | ||
self.charm._set_status(self.context.status) | ||
|
||
def _on_config_changed(self, event: ConfigChangedEvent): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: Here you want to add something similar to what we have in Kafka, to ensure that we request new certificates when our SANs change (if IP changed during a pod reschedule, for example). Example from Kafka:
current_sans = self.tls_manager.get_current_sans()
if not (properties and zk_jaas):
# Event fired before charm has properly started
event.defer()
return
current_sans_ip = set(current_sans["sans_ip"]) if current_sans else set()
expected_sans_ip = set(self.tls_manager.build_sans()["sans_ip"]) if current_sans else set()
sans_ip_changed = current_sans_ip ^ expected_sans_ip
current_sans_dns = set(current_sans["sans_dns"]) if current_sans else set()
expected_sans_dns = (
set(self.tls_manager.build_sans()["sans_dns"]) if current_sans else set()
)
sans_dns_changed = (current_sans_dns ^ expected_sans_dns) - {
# we omit 'kafka/{unit_id}' and 'kafka' here to avoid a bug with Digicert not supporting '/' characters in SANs
# Digicert truncates the 'kafka/{unit_id}' to just 'kafka'
# i.e don't assume we need new certs if 'diff' includes those value, as these SANs aren't typically used anyway
self.charm.state.unit_broker.unit.name,
self.charm.state.cluster.app.name,
}
# update environment
self.config_manager.set_environment()
Expand Down
self.charm.unit.set_workload_version(self.workload.get_version())
if sans_ip_changed or sans_dns_changed:
logger.info(
(
f'Broker {self.charm.unit.name.split("/")[1]} updating certificate SANs - '
f"OLD SANs IP = {current_sans_ip - expected_sans_ip}, "
f"NEW SANs IP = {expected_sans_ip - current_sans_ip}, "
f"OLD SANs DNS = {current_sans_dns - expected_sans_dns}, "
f"NEW SANs DNS = {expected_sans_dns - current_sans_dns}"
)
)
self.charm.tls.certificates.on.certificate_expiring.emit(
certificate=self.charm.state.unit_broker.certificate,
expiry=datetime.now().isoformat(),
) # new cert will eventually be dynamically loaded by the broker
self.charm.state.unit_broker.update(
{"certificate": ""}
) # ensures only single requested new certs, will be replaced on new certificate-available event
return # early return here to ensure new node cert arrives before updating advertised.listeners
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That paste was awful. It's in canonical/kafka-operator#297
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question how would the SANs change here? I mean, pod rescheduling would always re-request certificates I would think, no? it is like a new unit coming up live... We don't have extra listeners / sans or load-balancing yet, right?
Having said this, probably have the code to be already structured when the IP may change is not bad, I'm just wondering whether we would already have a use-case (and therefore a test for this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To test the logic, I've already added a unit test. We could add an integration test by simulating an IP change with lxc config device add {machine_name} eth0 none
and then lxc config device remove {machine_name} eth0
but I think it's more applicable to k8s ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the pod reschedules, it's still the same 'unit', so it would grab the existing certs from relation data. But the IP would be different now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though come to think of it, we should probably add the expose-external
logic here. I'll put it in the next sprint.
@deusebio @marcoppenheimer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Great job!
Changes:
kafka-client
interfaceNote: There is an issue with
kafka-operator
when TLS is enabled onkafka-client
interface, reported here, which affects this PR. once that issue is fixed, thekafka-client + TLS
should work properly on this charm.