core: memory alignment feature addition #9174

leonardo-albertovich · 2024-08-07T14:16:01Z

This PR adds a new feature flag named FLB_ENFORCE_ALIGNMENT. When it is enabled, fluent-bit performs an aligned memory access when decoding the timestamp field of a record. This is not the case normally because of how the msgpack wire protocol packs ext types.
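To illustrate why the timestamp ends up misaligned, here is a minimal sketch. It assumes the timestamp is packed as a msgpack fixext 8 object (a one-byte `0xd7` marker and a one-byte ext type, followed by 32-bit big-endian seconds and nanoseconds, as in the Fluentd forward protocol's EventTime). The `example_` names are hypothetical and are not part of this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of a msgpack fixext 8 timestamp:
 *   byte  0    : 0xd7 (fixext 8 marker)
 *   byte  1    : ext type
 *   bytes 2..5 : seconds, 32-bit big endian
 *   bytes 6..9 : nanoseconds, 32-bit big endian
 */
#define EXAMPLE_FIXEXT8_MARKER      0xd7
#define EXAMPLE_FIXEXT8_HEADER_SIZE 2

/* Number of bytes `address` lies past the previous 4-byte boundary. */
static size_t example_misalignment(uintptr_t address)
{
    return (size_t) (address % sizeof(uint32_t));
}

/* Pointer to the seconds field of a fixext 8 object (hypothetical helper). */
static const unsigned char *example_seconds_field(const unsigned char *ext)
{
    assert(ext[0] == EXAMPLE_FIXEXT8_MARKER);

    return ext + EXAMPLE_FIXEXT8_HEADER_SIZE;
}

/* Misalignment of the seconds field for an ext object starting at ext_start. */
static size_t example_seconds_misalignment(uintptr_t ext_start)
{
    return example_misalignment(ext_start + EXAMPLE_FIXEXT8_HEADER_SIZE);
}
```

Under these assumptions, an ext object that starts on a 4-byte boundary places its seconds field two bytes past that boundary, so a plain DWORD load of the field is unaligned.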

While it's clear that exchanging one read operation for seven reads and a few more writes is undesirable, to say the least, this is the only way I found to ensure that the output machine code was not tampered with by the compiler in any of the compiler and architecture combinations.

There are two alternative implementations for this function. One uses a single QWORD read (which is translated to one "load pair of DWORDs" operation in ARMv7):

uint32_t __attribute__((optimize("-O0"))) FLB_ALIGNED_DWORD_READ(char *source) {
    uintptr_t      alignment_offset;
    unsigned char *aligned_address;
    uint64_t       result;

    /* distance from the previous 4-byte boundary to source */
    alignment_offset = ((uintptr_t) source) % sizeof(uint32_t);
    aligned_address = (unsigned char *) &source[alignment_offset * -1];

    /* single 8-byte read starting at the aligned address */
    result = ((uint64_t *) aligned_address)[0];

    /* drop the leading bytes and keep the 4 requested ones (little-endian layout) */
    result >>= (alignment_offset * 8);
    result  &= 0xFFFFFFFF;

    return (uint32_t) result;
}

The other uses two individual DWORD reads:

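uint32_t __attribute__((optimize("-O0"))) FLB_ALIGNED_DWORD_READ(char *source) {
    uintptr_t      alignment_offset;
    unsigned char *aligned_address;
    uint32_t       result[2];

    /* distance from the previous 4-byte boundary to source */
    alignment_offset = ((uintptr_t) source) % sizeof(uint32_t);
    aligned_address = (unsigned char *) &source[alignment_offset * -1];

    /* first aligned DWORD: holds the low-order bytes of the value */
    result[0]   = ((uint32_t *) aligned_address)[0];
    result[0] >>= (alignment_offset * 8);

    /* second aligned DWORD: holds the remaining high-order bytes;
     * note that when alignment_offset is 0 the shift count is 32,
     * which is undefined behavior for a 32-bit operand in C */
    result[1]   = ((uint32_t *) aligned_address)[1];
    result[1] <<= ((sizeof(uint32_t) - alignment_offset) * 8);

    return (uint32_t) result[0] | result[1];
}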

However, both of these alternatives could read past the end of a memory page given the "right" pointer, resulting in a segmentation violation. More importantly, the code is far less straightforward, and thus more error prone, with questionable (if any) performance gains in return.
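The page boundary concern can be made concrete with a little address arithmetic. The sketch below assumes a 4096-byte page; the `example_` names are hypothetical. It checks whether widening a 4-byte read into an 8-byte read from the preceding 4-byte boundary touches a page that the original read did not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE 4096u /* assumed page size */

/* True when the byte range [address, address + length) spans two pages. */
static bool example_crosses_page(uintptr_t address, size_t length)
{
    assert(length > 0);

    return (address / EXAMPLE_PAGE_SIZE) !=
           ((address + length - 1) / EXAMPLE_PAGE_SIZE);
}

/* True when the requested 4-byte read stays within one page but the
 * widened 8-byte read from the aligned address spills into the next one. */
static bool example_wide_read_overruns(uintptr_t source)
{
    uintptr_t aligned;

    aligned = source - (source % sizeof(uint32_t));

    return !example_crosses_page(source, sizeof(uint32_t)) &&
            example_crosses_page(aligned, sizeof(uint64_t));
}
```

For instance, a pointer to the last four bytes of a page (page offset 4092) is already aligned, yet the widened read covers offsets 4092 through 4099 and therefore touches the following page, which may not be mapped.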

I wanted to add this context so that anyone who takes the time to review these changes or tries to improve the code in the future has enough information to avoid wasting time, getting frustrated or, worse, causing a regression.

Two additional useful pieces of information:

If you want to verify the generated machine code, godbolt is the easiest way to inspect a wide range of architecture, compiler and compiler flag combinations.
Using Cpulator you can easily test the code because it supports the relevant CPU flags.

It should be possible, and maybe even desirable, to create an integration test using either the unicorn emulator or qemu; however, that's not included in this PR.

Signed-off-by: Leonardo Alminana <[email protected]>

This will not change the underlying code for 99.9% of the users. However, when specifically enabled through the FLB_ENFORCE_ALIGNMENT feature flag, FLB_ALIGNED_DWORD_READ will issue four BYTE sized reads instead of a single DWORD sized read to ensure that memory access is aligned.

Signed-off-by: Leonardo Alminana <[email protected]>
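As a rough illustration of the byte-wise approach described in this commit message, the sketch below reads four individual bytes and combines them. It is not the function from this PR: the name `example_bytewise_dword_read` is hypothetical, and for simplicity it decodes a big-endian value directly rather than reproducing a native byte order load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit big-endian value using four single-byte loads.
 * Single-byte loads have no alignment requirement, so the source
 * pointer can sit at any address. */
static uint32_t example_bytewise_dword_read(const unsigned char *source)
{
    uint32_t result;

    assert(source != NULL);

    result  = ((uint32_t) source[0]) << 24;
    result |= ((uint32_t) source[1]) << 16;
    result |= ((uint32_t) source[2]) << 8;
    result |= ((uint32_t) source[3]);

    return result;
}
```

An optimizing compiler is allowed to merge these four loads back into a single wide load on targets where it considers unaligned access acceptable, so it is worth inspecting the generated machine code for the target in question.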

leonardo-albertovich added 4 commits August 7, 2024 15:58

build: added an option to enforce memory alignment

bd27266

Signed-off-by: Leonardo Alminana <[email protected]>

core: added a byte order detection abstraction macro

ba70894

Signed-off-by: Leonardo Alminana <[email protected]>

core: added aligned memory read functions

0f2b082

Signed-off-by: Leonardo Alminana <[email protected]>

leonardo-albertovich requested review from fujimotos, niedbalski, patrick-stephens, celalettin1286, edsiper and koleini as code owners August 7, 2024 14:16

leonardo-albertovich added this to the Fluent Bit v3.1.5 milestone Aug 7, 2024

github-actions bot added the docs-required label Aug 7, 2024

leonardo-albertovich temporarily deployed to pr August 7, 2024 14:16 — with GitHub Actions Inactive

leonardo-albertovich mentioned this pull request Aug 7, 2024

log_event_decoder: Fix misaligned load in timestamp decoder. #9096

Closed

1 task

leonardo-albertovich temporarily deployed to pr August 7, 2024 14:37 — with GitHub Actions Inactive

patrick-stephens added the ok-package-test Run PR packaging tests label Aug 8, 2024

edsiper merged commit bec6034 into master Aug 9, 2024
89 checks passed

edsiper deleted the leonardo-master-memory_alignment_enforcement branch August 9, 2024 19:48

BrewTestBot mentioned this pull request Aug 10, 2024

fluent-bit 3.1.5 Homebrew/homebrew-core#180727

Closed

This was referenced Aug 16, 2024

dockerfiles: Add FLB_ENFORCE_ALIGNMENT variable shuichiro-makigaki/fluent-bit#1

Closed

dockerfiles: Add FLB_ENFORCE_ALIGNMENT variable #9240

Open
