Conversation

mmat11
Contributor

@mmat11 mmat11 commented Sep 16, 2025

Context: #396

codecov bot commented Sep 16, 2025

Codecov Report

❌ Patch coverage is 75.55556% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.51%. Comparing base (b523ed4) to head (3825266).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/components/ebpf/common/httpfltr_transform.go | 74.19% | 7 Missing and 1 partial ⚠️ |
| pkg/config/ebpf_tracer.go | 50.00% | 2 Missing ⚠️ |
| pkg/components/ebpf/generictracer/generictracer.go | 85.71% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #631      +/-   ##
==========================================
- Coverage   61.52%   61.51%   -0.01%     
==========================================
  Files         199      199              
  Lines       22086    22105      +19     
==========================================
+ Hits        13588    13598      +10     
- Misses       7527     7535       +8     
- Partials      971      972       +1     
| Flag | Coverage Δ |
|---|---|
| integration-test-arm | 34.46% <73.33%> (-0.08%) ⬇️ |
| k8s-integration-test-1861- | ? |
| k8s-integration-test-1863- | 50.10% <73.33%> (?) |
| oats-test | 37.11% <68.88%> (-0.02%) ⬇️ |

@mmat11 mmat11 marked this pull request as ready for review September 17, 2025 23:27
@mmat11 mmat11 requested a review from a team as a code owner September 17, 2025 23:27
@mmat11 mmat11 changed the title from "DRAFT: bpf: use configurable large buffers for HTTP requests" to "bpf: use configurable large buffers for HTTP requests" Sep 17, 2025
Contributor

@rafaelroquetto rafaelroquetto left a comment

Good stuff!

Just a bunch of minor comments. The most pressing of them IMHO relates to streamlining the buffer sizes - currently we have a lot of them, and it's not clear to the naked eye / a future maintainer how (and if) they relate.

My recommendation is to have the three sizes (HTTP, MySQL and Postgres) as the source of their respective truths.

It's then possible to adjust the respective map value sizes at load time to match those, and also to be a bit more lax about the accepted values: instead of hardcoding them, we can constrain them to the required alignment and adjust the final value. This is less error prone because everything is enforced as a rule rather than through explicit comparisons, analogous to how we handle:

RingBufferSize uint32 `yaml:"ring_buffer_size" env:"OTEL_EBPF_NETWORK_RING_BUFFER_SIZE"`

// TODO(matt): validate all the existing attributes

switch c.BufferSizes.HTTP {
case 0, 128, 256, 512, 1024, 2048, 4096, 8192:
Contributor

Instead of hardcoding those values, we should generalise, as is already done in:

func effectiveRingBufferSize(size uint32) uint32 {

Usage:

ringBufferSize := effectiveRingBufferSize(rbSizeMB * 1024 * 1024)
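A minimal sketch of what the same idea could look like for the HTTP buffer size: the configured value is rounded up to the next power of two and clamped, instead of being compared against a hardcoded list. The name `effectiveHTTPBufferSize` and the 128 and 8192 bounds are assumptions for illustration (the bounds are taken from the smallest and largest non-zero values in the `switch` above); this is not the body of `effectiveRingBufferSize`, which is not shown in this thread.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	minHTTPBufferSize = 128  // assumed lower bound, smallest non-zero case in the switch
	maxHTTPBufferSize = 8192 // assumed upper bound, largest case in the switch
)

// effectiveHTTPBufferSize is a hypothetical helper. It keeps 0 as
// "disabled" and otherwise returns a power of two in [128, 8192].
func effectiveHTTPBufferSize(size uint32) uint32 {
	switch {
	case size == 0:
		return 0
	case size <= minHTTPBufferSize:
		return minHTTPBufferSize
	case size >= maxHTTPBufferSize:
		return maxHTTPBufferSize
	}
	// bits.Len32(size-1) is the number of bits needed to represent size-1,
	// so shifting 1 by that amount rounds size up to the next power of two
	// (exact powers of two are returned unchanged).
	return uint32(1) << bits.Len32(size-1)
}

func main() {
	for _, configured := range []uint32{0, 100, 128, 129, 1000, 9000} {
		fmt.Println(configured, "->", effectiveHTTPBufferSize(configured))
	}
}
```

With a rule like this, any value a user configures maps to a size the eBPF side can mask safely, and validation no longer needs to enumerate the accepted values.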

large_buf->type = EVENT_TCP_LARGE_BUFFER;
large_buf->packet_type = packet_type;
large_buf->action = action;
__builtin_memcpy((void *)&large_buf->tp, (void *)&req->tp, sizeof(tp_info_t));
Contributor

Suggested change
- __builtin_memcpy((void *)&large_buf->tp, (void *)&req->tp, sizeof(tp_info_t));
+ large_buf->tp = req->tp;

const u32 buf_len_mask = http_buffer_size - 1;

tcp_large_buffer_t *large_buf = (tcp_large_buffer_t *)http_large_buffers_mem();
if (!large_buf) {
Contributor

Suggested change
if (!large_buf) {
if (!large_buf) {

__builtin_memcpy((void *)&large_buf->tp, (void *)&req->tp, sizeof(tp_info_t));

large_buf->len = bytes_len;
if (large_buf->len >= http_buffer_size) {
Contributor

Suggested change
if (large_buf->len >= http_buffer_size) {
if (large_buf->len >= http_buffer_size) {

Comment on lines +26 to +29
enum {
k_http_large_buf_max_size = 1 << 14, // 16K
k_http_large_buf_max_size_mask = k_http_large_buf_max_size - 1,
};
Contributor

I think you don't need these. Instead, you can use SCRATCH_MEM_NAMED(http_large_buffers, tcp_large_buffer_t)

Then, in userspace, you effectively set the value size of this map to be that of the HTTP buffer (if > 0), like you do for http_buffer_size.

We do a similar thing here:

spec.Maps["direct_flows"].MaxEntries = ringBufferSize

But instead of .MaxEntries you can use .ValueSize

Contributor Author

I could do this, but we don't save much per CPU. I wonder whether it's worth adding this complication for a few kilobytes.

Contributor

It's not only that; the main point is to streamline all of these constants. Currently we have http_buffer_size being set from userspace and others defined in kernel space, and they are not independent of each other - so it's easy for someone to break the code by adjusting k_http_large_buf_max_size to a size that ends up being less than the max allowed http_buffer_size, for instance - they need to know to change it in both places.

On the other hand, userspace can calculate the "effective http buffer size" from the original BufferSizes.HTTP value (e.g. rounding it to the next power of 2) and ensure both the map value size (the scratch mem) and http_buffer_size remain compatible (e.g. they could be the same when the buffer size is > 0, otherwise the map value size can be 1 to pass the verifier)
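As a rough sketch of that flow, under stated assumptions: `mapSpec` below is a local stand-in for the `ValueSize` and `MaxEntries` fields of a cilium/ebpf map spec, the map name `http_large_buffers` is assumed to match what `SCRATCH_MEM_NAMED` generates, and `applyHTTPBufferSize` is a hypothetical helper. It follows the suggestion literally (value size equal to the effective buffer size, or 1 when disabled); whether the `tcp_large_buffer_t` header bytes must be added on top is not settled in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// mapSpec is a local stand-in (assumption) for the two fields of a
// cilium/ebpf MapSpec that matter here, so the sketch runs without
// external dependencies.
type mapSpec struct {
	ValueSize  uint32
	MaxEntries uint32
}

// applyHTTPBufferSize is a hypothetical helper. Given an already rounded
// effective size, it sets the scratch map's value size and returns the
// value to use for http_buffer_size, so both come from one source of truth.
func applyHTTPBufferSize(maps map[string]*mapSpec, effective uint32) (uint32, error) {
	m, ok := maps["http_large_buffers"] // assumed map name
	if !ok {
		return 0, errors.New("map http_large_buffers not found in spec")
	}
	// A power of two has exactly one bit set, so x & (x-1) is zero.
	// Zero also passes this check and means "large buffers disabled".
	if effective&(effective-1) != 0 {
		return 0, fmt.Errorf("buffer size %d is not a power of two", effective)
	}
	if effective == 0 {
		// Keep a minimal non-zero value size so the map can still be created.
		m.ValueSize = 1
		return 0, nil
	}
	m.ValueSize = effective
	return effective, nil
}

func main() {
	maps := map[string]*mapSpec{"http_large_buffers": {MaxEntries: 1}}
	size, err := applyHTTPBufferSize(maps, 4096)
	fmt.Println(size, maps["http_large_buffers"].ValueSize, err)
}
```

Because the kernel-side constant disappears, there is no second place that a maintainer has to remember to update when the maximum changes.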

req->has_large_buffers = true;

bpf_ringbuf_output(
&events, large_buf, total_size & k_http_large_buf_max_size_mask, get_flags());
Contributor

You may not need the mask here (in my tests I did not, but I did not test with older kernels).

If we can remove it, then we can do without those k_http_large_buf... constants altogether and rely on a single source of truth - way less error prone.

Contributor Author

I will test it without the mask; I just added it out of habit :D

}
requestBuffer = b

// TODO: response?
Contributor

Does that mean we are not using/consuming the buffers being set for the response? If that's the case, then we should remove that part from the ebpf code as well (i.e. not send any response buffers) until this is done - otherwise we are just wasting processing.

Contributor Author

It's not being used as of now, but we will need it shortly - I have GraphQL support in a WIP state which needs this.

Contributor

I'd suggest moving it as part of that PR then

}
result.URL = event.url()
result.Method = event.method()
result.URL = httpURLFromBuf(requestBuffer)
Contributor

For each of these extraction methods, we always allocate a new string no matter what, e.g. buf := string(req), which copies the underlying buffer, just to discard it sometimes when we do not find what we are looking for.

Instead, we should strive to allocate only when we really need to return a non-empty value. This predates this PR, but since we are touching these, it's a good opportunity to fix them.
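One way this could look, as a sketch only: search the byte slice directly and convert to a string once, at the point where a non-empty result is known. `urlFromBuf` is a hypothetical stand-in and is not the `httpURLFromBuf` from this PR; it assumes a request line of the form `METHOD SP URL SP VERSION` and treats a missing terminator as a truncated capture.

```go
package main

import (
	"bytes"
	"fmt"
)

// urlFromBuf is a hypothetical stand-in for the extraction helpers
// discussed above. It scans the raw bytes and allocates a string only
// when it has found a non-empty URL to return.
func urlFromBuf(req []byte) string {
	// The first space ends the method token.
	start := bytes.IndexByte(req, ' ')
	if start < 0 {
		return "" // nothing found, nothing allocated
	}
	rest := req[start+1:] // sub-slicing does not copy

	// The URL ends at the next space or line break. If the capture was
	// truncated before that, take everything that is left.
	end := bytes.IndexAny(rest, " \r\n")
	if end < 0 {
		end = len(rest)
	}
	if end == 0 {
		return ""
	}
	// The only allocation: copy just the bytes of the URL.
	return string(rest[:end])
}

func main() {
	fmt.Printf("%q\n", urlFromBuf([]byte("GET /users/1 HTTP/1.1\r\n")))
	fmt.Printf("%q\n", urlFromBuf([]byte("garbage")))
}
```

The same shape applies to the other extraction helpers: locate the bounds in the `[]byte` first, and convert only the matched range.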

2 participants