forked from cilium/cilium
bpf, dbg: add vtep policy for flexible routing from/to outside world #1
Open
zstas wants to merge 176 commits into main from vtep_policy_cleanup
The only accepted values for KPR are currently true and false, hence any other value will break helm install even if upgradeCompatibility is set to an older version (e.g. 1.8, 1.9, 1.12). Signed-off-by: Tam Mach <[email protected]>
Documents the request-timeout ingress annotation. Signed-off-by: iofq <[email protected]>
…Blocks cilium#41616 found a data race in setting Block.predict. Block was never meant to be changed during reachability analysis; all bookkeeping must be done out-of-band (like the live bitmap). This test reliably triggers the race (and others like it) when run with -race. Signed-off-by: Timo Beckers <[email protected]>
Block should not be modified as it's shared between users of (copies of) a
CollectionSpec. Store prediction results out-of-band instead.
Changing visitBlock() to take a Block instead of a *Block was considered
in addition, but proved too costly due to the amount of copying required.
goos: linux
goarch: amd64
pkg: github.com/cilium/cilium/pkg/bpf/analyze
cpu: AMD Ryzen 7 3700X 8-Core Processor
                 │   old.txt   │              new.txt               │
                 │   sec/op    │   sec/op     vs base               │
ComputeBlocks-16   642.5µ ± 1%   645.0µ ± 2%       ~ (p=0.699 n=6)
Reachability-16    31.39µ ± 4%   33.83µ ± 2%  +7.78% (p=0.002 n=6)
geomean            142.0µ        147.7µ       +4.02%

                 │   old.txt    │               new.txt               │
                 │     B/op     │     B/op      vs base               │
ComputeBlocks-16   372.6Ki ± 0%   372.6Ki ± 0%  +0.01% (p=0.002 n=6)
Reachability-16    8.172Ki ± 0%   8.328Ki ± 0%  +1.91% (p=0.002 n=6)
geomean            55.18Ki        55.71Ki       +0.96%

                 │   old.txt   │               new.txt                │
                 │  allocs/op  │  allocs/op   vs base                 │
ComputeBlocks-16   8.054k ± 0%   8.054k ± 0%        ~ (p=1.000 n=6) ¹
Reachability-16    3.000 ± 0%    4.000 ± 0%   +33.33% (p=0.002 n=6)
geomean            155.4         179.5        +15.47%
¹ all samples are equal
Signed-off-by: Timo Beckers <[email protected]>
This change introduces a new feature that allows tracing IPv4 packets that carry a trace ID embedded in an IP option. Code changes include creating the feature flag, the parsing logic, and the BPF map to store the trace ID. The following changes are included: - A new feature-gate, `ip-tracing-option-type`, is added to enable and configure the IP option type to be used for tracing. - Helper functions are implemented to parse IPv4 options and extract the trace ID. - Logic is added to save the parsed IPv4 options into a per-CPU array map, which can then be used by other parts of the system. Signed-off-by: Ben Bigdelle <[email protected]>
This change introduces the ability to parse IP options in ingress BPF programs. This is a prerequisite for implementing IP-based tracing on the ingress path. If the feature is enabled, and a trace ID exists for a packet, it is stored into the BPF map to be used in event messages. - Implemented parsing logic for IP options in ingress programs. - Store extracted IP options into per-CPU array map. Signed-off-by: Ben Bigdelle <[email protected]>
This change refactors the drop notify tests to be version-aware. This is a preparatory step to allow for the introduction of new fields to the `DropNotify` struct in a backward-compatible manner. The tests are updated to: - Define separate test cases for different versions of the `DropNotify` struct. Signed-off-by: Ben Bigdelle <[email protected]>
This change extends the `DropNotify` struct to include the IP trace ID. The following changes are included: - The `DropNotify` struct in the control plane is updated to include the `IPTraceID` field. - The BPF code is updated to check whether a trace ID is stored in the BPF map and, if so, populate it in the message. - The `cilium-monitor` output is updated to display the IP trace ID when present in a drop notify message. Signed-off-by: Ben Bigdelle <[email protected]>
This change extends the `TraceNotify` struct to include the IP trace ID. The following changes are included: - The `TraceNotify` struct in the control plane is updated to include the `IPTraceID` field. - When a TraceNotify event is created, the BPF code checks whether an IP trace ID is stored in the BPF map and, if so, populates it in the message. - The `cilium-monitor` output is updated to display the IP trace ID when present in a trace notify message. Signed-off-by: Ben Bigdelle <[email protected]>
This change introduces the IPTraceID field to the Hubble protobuf. This allows IP-based tracing information to be propagated and associated with flows observed by Hubble. The following changes are included: - A new `IPTraceID` message type is defined in `flow.proto`, containing the trace ID and the IP option type. - The Hubble parser is updated to decode the IP trace ID from monitor events (both drop and trace notifications) and populate the `ip_trace_id` field in the resulting `Flow` message. - The Hubble printer is updated to display the IP trace ID in the output. Signed-off-by: Ben Bigdelle <[email protected]>
This change introduces the ability to filter Hubble flows by IP trace ID directly from the Hubble CLI. The following changes are included: - A new `--ip-trace-id` flag is added to the `hubble observe` command, which can be specified multiple times to filter for multiple trace IDs. - A new `IPTraceIDFilter` is implemented to perform the filtering logic based on the provided trace IDs. - The `IPTraceIDFilter` is added to the list of default filters. - The help text for the `hubble observe` command is updated to include the new flag. Signed-off-by: Ben Bigdelle <[email protected]>
We stored the port name in the frontend mapping struct,
but did not add it to the frontend params.
Given this type of LRP:
apiVersion: "cilium.io/v2"
kind: CiliumLocalRedirectPolicy
metadata:
  name: "lrp-addr"
spec:
  redirectFrontend:
    addressMatcher:
      ip: "169.254.169.254"
      toPorts:
        - port: "8080"
          name: "test"
          protocol: TCP
        - port: "8081"
          name: "test1"
          protocol: TCP
  redirectBackend:
    localEndpointSelector:
      matchLabels:
        app: proxy
    toPorts:
      - port: "80"
        name: "test"
        protocol: TCP
      - port: "81"
        name: "test1"
        protocol: TCP
and this pod:
apiVersion: v1
kind: Pod
metadata:
  name: lrp-pod
  labels:
    app: proxy
spec:
  containers:
    - name: lrp-pod
      image: nginx
      ports:
        - containerPort: 80
          name: test
          protocol: TCP
        - containerPort: 81
          name: test1
          protocol: TCP
we end up with every frontend mapped to both backend ports:
6    169.254.169.254:8080/TCP   LocalRedirect   1 => 10.244.1.75:80/TCP (active)
                                                2 => 10.244.1.75:81/TCP (active)
7    169.254.169.254:8081/TCP   LocalRedirect   1 => 10.244.1.75:80/TCP (active)
                                                2 => 10.244.1.75:81/TCP (active)
With this PR, each frontend gets the correct backend:
8    169.254.169.254:8080/TCP   LocalRedirect   1 => 10.244.1.30:80/TCP (active)
9    169.254.169.254:8081/TCP   LocalRedirect   1 => 10.244.1.30:81/TCP (active)
Signed-off-by: Liyi Huang <[email protected]>
Signed-off-by: Aditi Ghag <[email protected]>
Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com>
We already had logic to prune statically unreachable tail calls, i.e. tail calls that are unreachable due to compile-time macros. However, we did not prune tail calls that are unreachable due to load-time config. This commit updates the existing logic to also prune tail calls that are unreachable due to load-time config. This should allow us to migrate macros that control the reachability of tail calls to load-time config. Signed-off-by: Dylan Reimerink <[email protected]> Signed-off-by: Timo Beckers <[email protected]>
Move the logic for pruning unused tail calls from collection.go to a new file. Signed-off-by: Dylan Reimerink <[email protected]>
The `LoadCollectionSpec` function was a wrapper around `ebpf.LoadCollectionSpec` that additionally did the unused tail call pruning. Now that this functionality has been moved into the `LoadCollection` function, this wrapper is no longer needed. Signed-off-by: Dylan Reimerink <[email protected]>
Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com>
Signed-off-by: Cilium Imagebot <[email protected]>
Fixes CI flake cilium#41550. With multiple NodeUpdate/NodeDelete events firing at once, the order in which the state is committed to the file is non-deterministic. Instead of deleting the checkpoint file and waiting for the correct state to be written back, we poll the file, reading until the correct state is found or the timeout expires. Signed-off-by: Charlie Kenney <[email protected]>
Update the procedure for installing Cilium Open Source on a Rancher-managed cluster using HelmCharts and the Helm Operator during cluster bootstrapping. Signed-off-by: Filip Wardzichowski <[email protected]>
Matching port numbers on different L4 protocols don't work as expected. This commit fixes it by grouping all ports on a pod and producing the service, backend and frontend links, instead of doing so on a per-port, per-container basis. Signed-off-by: Bernardo Soares <[email protected]>
Use a single command line for both cilium agent and cilium operator. Signed-off-by: André Martins <[email protected]>
This removes the usage of GlobalServiceCache in the agent, which was only useful for counting the number of global Services. This count did not account for the local cluster and was thus misleading. While the performance impact was not tested, this removes the management of two levels of nested maps and a global lock on each remote endpoint update, which should be valuable. The global services count reported through cilium-dbg and the CLI is no longer supported/exposed. Users with an older version of the CLI will always see a count of 0 reported. Global Service counts will continue to be reported per cluster, along with the counts of other resources. Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>
Report per-cluster metrics using the watch store for endpoints, global services, and MCS service exports, as we already do for remote nodes. Unfortunately, this does not include identities, which do not use the watch store. We no longer attempt to report the Global Services and Global Service Export counts. Note that those global counts did not account for the local cluster, which was misleading. Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>
This was only needed for the upgrade from v1.17, and can now safely go away in the v1.19 release. Signed-off-by: Julian Wiedmann <[email protected]>
This was only needed for the upgrade from v1.17, and can now safely go away in the v1.19 release. Signed-off-by: Julian Wiedmann <[email protected]>
Ideally we're not using the ipcache to determine the source identity, but fully rely on the identity transported via VNI. Condense the inbound path a bit to make it clearer in which cases we still require an ipcache lookup. Signed-off-by: Julian Wiedmann <[email protected]>
… to v1.19.1 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com>
L4AddrFromString takes a string-formatted L4Addr (such as 443/tcp) and returns the equivalent L4Addr struct. Signed-off-by: Bernardo Soares <[email protected]>
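As a rough illustration of this kind of parsing, the sketch below converts a string such as 443/tcp into a struct. The `L4Addr` type shown here, its field names, the upper-casing of the protocol, and the `parseL4Addr` helper are hypothetical stand-ins for the example and are not taken from Cilium's implementation of `L4AddrFromString`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// L4Addr is a hypothetical stand-in for the struct named in the commit.
type L4Addr struct {
	Protocol string
	Port     uint16
}

// parseL4Addr parses "<port>/<protocol>", for example "443/tcp".
func parseL4Addr(s string) (L4Addr, error) {
	portStr, proto, ok := strings.Cut(s, "/")
	if !ok || proto == "" {
		return L4Addr{}, fmt.Errorf("invalid L4 address %q: want <port>/<protocol>", s)
	}
	// A bit size of 16 makes ParseUint reject ports above 65535.
	port, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil {
		return L4Addr{}, fmt.Errorf("invalid port in %q: %w", s, err)
	}
	return L4Addr{Protocol: strings.ToUpper(proto), Port: uint16(port)}, nil
}

func main() {
	addr, err := parseL4Addr("443/tcp")
	fmt.Println(addr, err)

	_, err = parseL4Addr("443")
	fmt.Println(err)
}
```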
Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>
Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>
Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>
The BPF programs will contain the SkipLB map definition regardless of whether it is used or not. This can cause flakes in CI if the loader races loading endpoint programs in parallel and each of them tries to pin the skiplb maps. To avoid this, always open and pin the skiplb maps. The error in logs was: level=error msg="Error while reloading endpoint BPF program" ... error="loading eBPF collection into the kernel: map cilium_skip_lb6: pin map to .../cilium_skip_lb6: file exists" Signed-off-by: Jussi Maki <[email protected]>
This is another goroutine forked by workqueue which we can't reliably wait for from ShutDown(). Signed-off-by: Jussi Maki <[email protected]>
It is possible to update the local node to hold an inconsistent ENI state, which prevents correct ENI device configuration. This change makes setOwnNodeWithoutPoolUpdate handle local node updates consistently with how updateLocalNodeResource handles them. Fixes: cilium#41626 Signed-off-by: Jason Aliyetti <[email protected]>
Some cleanup missed in commit:
Commit: e202a4e
Author: Louis DeLosSantos <[email protected]>
Date: Wed Oct 30 11:01:50 2024 -0400
ipsec: despecify decrypted overlay
We no longer use the EncryptedOverlayReqID constant anywhere in the codebase to specify a reqid for overlay traffic. Remove this constant.
Signed-off-by: Louis DeLosSantos <[email protected]>
This is highly verbose and doesn't see enough use to be enabled with debug. It can still be enabled explicitly. Signed-off-by: David Bimmler <[email protected]>
There's a shared file in statedir/endpoint-policy.log that is written to by all endpoints which have policy debug logging enabled. Prior to this commit, they'd all allocate their own lumberjack.Logger wrapping a file descriptor for this file. That's broken, since a lumberjack logger tracks the size of writes to determine when to rotate the logfile. Since each logger only tracked the writes of its own endpoint, the file could grow large without ever being rotated. Worse, once rotated, all other endpoints would still write to their FD, i.e. the old file. Clean this up by sharing a single logger, which is likely a bottleneck, but one that shouldn't be hit in prod since this is clearly a debug option. Signed-off-by: David Bimmler <[email protected]>
This commit fixes an `Owns` call that was updated to use EndpointSlice (instead of Endpoint) in This change was missed in the refactor in cilium#41323. Signed-off-by: Nick Young <[email protected]>
This is a move/rename only commit preparing the structure to refactor the namespace manager as a Cell. Signed-off-by: Alexandre Perrin <[email protected]>
For testing purposes, reducing the use of the public namespace.NewManager. Cosmetic dedup in local_observer_test.go on the way, making noopParser accept testing.TB as param. Signed-off-by: Alexandre Perrin <[email protected]>
Set up the namespace cleanup as a job.Timer instead of open-coding it in our own goroutine. Signed-off-by: Alexandre Perrin <[email protected]>
This commit extends the pkg/shell to allow configuring the shell socket path via cell config. This is useful in all those cases in which we may want to leverage pkg/shell for IPC (eg. in tests with multiple forked processes) or if we just want to change the default path for convenience. Documentation updates have been generated accordingly. Signed-off-by: Simone Magnani <[email protected]>
Signed-off-by: Jarno Rajahalme <[email protected]>
When Cilium starts in tunnel mode, a route for each remote node's pod CIDRs is added to the current node. For these routes, the MTU is set to 1450 to account for the tunnel overhead. If the routing mode is then changed to native and Cilium is restarted, the stale routes should be deleted at startup. Unfortunately, this currently does not happen because the MTU is set to 1500 in the deletion request, resulting in the following error from netlink: msg="Unable to delete route" ... error="no such process" In other words, the route cannot be found because the MTU value set in the deletion request does not match that of the installed route. To solve this, disregard the MTU value when deleting a route. This should still allow stale IPsec-related routes to be removed correctly, as intended in commit 35ca979. Fixes: 35ca979 ("datapath/linux/route: Fix Delete") Fixes: cilium#41811 Signed-off-by: Fabio Falzoi <[email protected]>
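The effect of ignoring the MTU during deletion can be modelled with a small in-memory sketch. It does not use netlink or any Cilium code: the `Route` type, its fields, the device name and the helpers below are hypothetical, and only illustrate why a lookup that includes the MTU misses a route installed with a different value.

```go
package main

import "fmt"

// Route is a simplified, hypothetical route entry for illustration only.
type Route struct {
	Prefix string
	Device string
	MTU    int
}

// sameRoute compares the identifying fields and deliberately ignores MTU.
func sameRoute(a, b Route) bool {
	return a.Prefix == b.Prefix && a.Device == b.Device
}

// deleteRoute returns a copy of table without the first route matching
// target, and reports whether a matching route was found.
func deleteRoute(table []Route, target Route) ([]Route, bool) {
	for i, r := range table {
		if sameRoute(r, target) {
			out := append([]Route{}, table[:i]...)
			return append(out, table[i+1:]...), true
		}
	}
	return table, false
}

func main() {
	// Route installed while running in tunnel mode (MTU 1450).
	table := []Route{{Prefix: "10.0.1.0/24", Device: "cilium_host", MTU: 1450}}

	// Deletion request built after switching to native routing (MTU 1500).
	table, found := deleteRoute(table, Route{Prefix: "10.0.1.0/24", Device: "cilium_host", MTU: 1500})
	fmt.Println(found, len(table))
}
```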
Fill the [metav1.TypeMeta] for objects added via the Clientset if the
TypeMeta is unset.
E.g. CoreV1().Nodes().Create(&Node{ObjectMeta{Name: "foo"}}) would have
previously created a node object with Kind="" and APIVersion="" and with
this it'll have Kind="Node" and APIVersion="v1".
Signed-off-by: Jussi Maki <[email protected]>
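The defaulting behaviour described in this commit can be sketched with simplified types. The `TypeMeta` and `Node` structs below are local stand-ins rather than the Kubernetes API types, and `fillTypeMeta` is a hypothetical helper; treating "unset" as both fields being empty is also an assumption of this example.

```go
package main

import "fmt"

// TypeMeta and Node are simplified stand-ins for the Kubernetes API types.
type TypeMeta struct {
	Kind       string
	APIVersion string
}

type Node struct {
	TypeMeta
	Name string
}

// fillTypeMeta (hypothetical helper) sets Kind and APIVersion only when
// both are still empty, leaving explicitly set values untouched.
func fillTypeMeta(tm *TypeMeta, kind, apiVersion string) {
	if tm.Kind == "" && tm.APIVersion == "" {
		tm.Kind = kind
		tm.APIVersion = apiVersion
	}
}

func main() {
	n := &Node{Name: "foo"}
	fillTypeMeta(&n.TypeMeta, "Node", "v1")
	fmt.Printf("Kind=%q APIVersion=%q\n", n.Kind, n.APIVersion)
}
```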
Hostfw and ipsec aren't compatible. Signed-off-by: darox <[email protected]>
* Fix v6 utils that had `svc_one` hardcoded.
* Improve `pkt_defs.py` by using `()` instead of \.
Signed-off-by: Marc Suñé <[email protected]>
Changes introduced in df5501e did not assign the interface_mac in the IPv6 ND tests, resulting in packets being sent with the NULL MAC address. The test passes because the asserts use the same value to check against. Use another (valid) MAC for the interface. Signed-off-by: Marc Suñé <[email protected]>
Commit 11c329f fixed handling of ICMPv6 neighbour solicitations that didn't have the Link Layer Source option. For NA, it adds the 8 additional bytes of the option. While moving IPv6 NDP unit tests to scapy, unit tests failed due to an incorrect ICMPv6 checksum for NS packets without the LLSRC option. The problem is that the ICMPv6 pseudo-header contains the payload length, and the code was not considering it as part of the csum diff. This was not spotted because: * Unit tests (before scapy) don't check csums. * When L4 csum is offloaded, it really doesn't matter. This commit changes `icmp6_send_ndisc_adv()`: * Fixes the csum accordingly. * Removes an unnecessary call to `l4_csum_replace()`, by accumulating the csum diff in `sum`. NOTE: it would be a good idea to refactor `icmp6_send_ndisc_adv()` to use direct packet access _and_ avoid superfluous copies of the (new) icmpv6 hdr and (new) opts. Seems like a good first issue :). Signed-off-by: Marc Suñé <[email protected]>
Fix ASSERT_CTX_BUF_OFF() implementation incorrectly accessing data (in stack) instead of __data (in the body of the assert). Move aux pointers to __DATA, __DATA_END to avoid this in the future. Signed-off-by: Marc Suñé <[email protected]>
This commit adapts the IPv6 NDP BPF unit test to scapy. Signed-off-by: Marc Suñé <[email protected]>
Create an auxiliary function that encapsulates the return code checks, drastically reducing the number of lines. Signed-off-by: Marc Suñé <[email protected]>
This commit fixes an issue for encrypted packets arriving in bpf_host. When checking that they are encrypted using the packet mark, we shouldn't expect the mark to be equal to MARK_MAGIC_DECRYPT. Instead, we should check that the MARK_MAGIC_DECRYPT bit is set. This issue isn't affecting anything today, but will once we support IPsec + BPF Host Routing. Fixes: 1dadae3 ("bpf: Don't skip local delivery for plain-text packets") Signed-off-by: Paul Chaignon <[email protected]>
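The difference between the two checks can be illustrated as follows. The constant value and the helper names are assumptions made for this sketch and are not copied from the BPF sources; the point is only that an equality test fails as soon as the mark carries any additional bits, while a masked test does not.

```go
package main

import "fmt"

// markMagicDecrypt is an assumed value used for illustration only.
const markMagicDecrypt uint32 = 0x0D00

// isDecryptedExact mirrors the old check: the whole mark must equal the magic value.
func isDecryptedExact(mark uint32) bool {
	return mark == markMagicDecrypt
}

// isDecryptedMasked mirrors the new check: only the magic bits are inspected,
// so other bits stored in the mark do not affect the result.
func isDecryptedMasked(mark uint32) bool {
	return mark&markMagicDecrypt == markMagicDecrypt
}

func main() {
	plain := markMagicDecrypt
	withExtra := markMagicDecrypt | 0x12340000 // magic value plus unrelated upper bits

	fmt.Println(isDecryptedExact(plain), isDecryptedMasked(plain))
	fmt.Println(isDecryptedExact(withExtra), isDecryptedMasked(withExtra))
}
```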
33c7688 to
d623d6a
Co-developed-by: Jarrod Baumann <[email protected]> Signed-off-by: Jarrod Baumann <[email protected]> Signed-off-by: Stanislav Zaikin <[email protected]>
d623d6a to
d89617a
95fbccb to
d89617a
Co-developed-by: Jarrod Baumann [email protected]
Please ensure your pull request adheres to the following guidelines:
description and a
Fixes: #XXX line if the commit addresses a particular GitHub issue.
Fixes: <commit-id> tag, then please add the commit author[s] as reviewer[s] to this issue.
Fixes: #issue-number