Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: memory alignment feature addition #9174

Merged
merged 4 commits into from
Aug 9, 2024

Conversation

leonardo-albertovich
Copy link
Collaborator

This PR adds a new feature flag named FLB_ENFORCE_ALIGNMENT which can be used to ensure that when decoding the timestamp field from a record fluent-bit ensures that the memory access operation is aligned which is not the case normally due to how the msgpack wire protocol packs ext types.

While it's clear that exchanging one read operation for seven reads and a few more writes is undesirable to say the least, this is the only way I found to ensure that the output machine code was not tampered with in any of the compiler and architecture combinations.

There are two alternative implementations for this function using either two individual DWORD reads :

uint32_t __attribute__((optimize("-O0"))) FLB_ALIGNED_DWORD_READ(char *source) {
    uintptr_t      alignment_offset;
    unsigned char *aligned_address;
    uint64_t       result;

    alignment_offset = ((uintptr_t) source) % sizeof(uint32_t);
    aligned_address = (unsigned char *) &source[alignment_offset * -1];

    result = ((uint64_t *) aligned_address)[0];

    result >>= (alignment_offset * 8);
    result  &= 0xFFFFFFFF;

    return (uint32_t) result;
}

Or one QWORD read (which is translated to one "load pair of DWORDs" operation in ARMv7) :

uint32_t __attribute__((optimize("-O0"))) FLB_ALIGNED_DWORD_READ(char *source) {
    uintptr_t      alignment_offset;
    unsigned char *aligned_address;
    uint32_t       result[2];

    alignment_offset = ((uintptr_t) source) % sizeof(uint32_t);
    aligned_address = (unsigned char *) &source[alignment_offset * -1];

    result[0]   = ((uint32_t *) aligned_address)[0];
    result[0] >>= (alignment_offset * 8);

    result[1]   = ((uint32_t *) aligned_address)[1];
    result[1] <<= ((sizeof(uint32_t) - alignment_offset) * 8);

    return (uint32_t) result[0] | result[1];
}

However, both of these alternatives could exceed the bounds of a memory page given the "right" pointer resulting in a segment violation and more importantly, the code is way less straightforward and thus error prone with questionable (if any) performance gains in return.

I wanted to add this context so anyone who takes the time to review these changes or tries to improve the code in the future has enough information to avoid wasting time, getting frustrated or worse causing a regression.

Two additional useful pieces of information :

  1. If you want to verify the machine code generated use godbolt is the easiest way to be able to inspect a wide range of architecture, compiler and compiler flag combinations.
  2. Using Cpulator you can easily test the code because it does support the relevant CPU flags
  3. It should be possible and maybe even desirable to create an integration test using either the unicorn emulator or qemu, however, that's not included in this PR.

This will not change the underlying code for 99.9% of the users,
however, when specifically enabled it through the FLB_ENFORCE_ALIGNMENT
feature flag, FLB_ALIGNED_DWORD_READ will issue four BYTE sized reads
instead of a single DWORD sized read to ensure that memory access is
aligned.

Signed-off-by: Leonardo Alminana <[email protected]>
@leonardo-albertovich leonardo-albertovich added this to the Fluent Bit v3.1.5 milestone Aug 7, 2024
@patrick-stephens patrick-stephens added the ok-package-test Run PR packaging tests label Aug 8, 2024
@edsiper edsiper merged commit bec6034 into master Aug 9, 2024
89 checks passed
@edsiper edsiper deleted the leonardo-master-memory_alignment_enforcement branch August 9, 2024 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-required ok-package-test Run PR packaging tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants