Skip to content

Conversation

Semisol
Copy link
Contributor

@Semisol Semisol commented Aug 28, 2025

This PR adds an experimental tuple decoder that is optimized for lower allocations and higher performance, support for fixed-length byte strings, along with fixing some panics.

This should be kept experimental until a way to handle the potentially breaking changes is found (possibly combining it with other breaking binding changes in a Go bindings "v2")

Over the original decoder, the new decoder attempts to zero-copy strings, and uses a new Boxed type which requires zero allocations compared to the any type which requires its contents are on the heap.
A neat hack relating to on-stack slice allocation is also used to remove allocations from append that arise from tuples while not having to estimate the length of the tuple for capacity.

The tests are also improved to ensure unpacking works correctly, and benchmarks are added.

This PR was closed and reopened due to separating this update to a separate branch

Benchmarking

To run benchmarks: go test -run=NONE -bench=^BenchmarkTupleUnpack

To-do list

  • Improve code documentation
  • Add fuzz tests that were used to discover panics that should have been errors when unpacking tuples
  • Handle the breaking change to google/uuid (we may be able to just alias tuple.UUID to it)
  • Make the V2 decoder the primary decoder, and remove the original decoder's code before merge.
  • Handle the fact that some users may expect Unpack to not depend on the input buffer. This could be "fixed" by copying the input buffer or by making a breaking v2 release (alongside possibly other binding improvements).
  • Add support for big.Int. The new decoder does not support this right now.
  • Add support for end-of-tuple type encoding and returning bytes after EOT
  • Consider adding support for 64-bit identifier type

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

This commit:
- Requires a minimum Go version 1.20 to allow using newer features
  in future commits
- Adds support for the length-prefixed bytes type
- Fixes certain panics detected by fuzzing on Unpack
- Switches to binary.BigEndian methods for decoding integers which
  eliminates allocations and is faster
- Switches to google/uuid for UUIDs to reduce conversion boilerplate
  as it seems to be the de-facto library for UUIDs
- Does some other refactors for a future tuple "builder" to reduce
  allocations and eliminate type-casting overhead
This commit adds an experimental tuple unpacker to the Go bindings
which reduces unnecessary allocations and tries to optimize decoding.

Tests for the FixedLen type are also added, alongside extending tests
to ensure the unpacked value is identical to the starting value and
adding benchmarks, and negative numbers.

- Zero-copy is used for strings and byte slices whenever possible to
  eliminate allocations.
- Suprisingly, a loop was faster than using standard library functions
  to locate the end of a string and check.
- A function table was used as it could allow extensibility in the
  future and may be faster on certain platforms.
- A custom Boxed type is used, which is larger than an interface{}
  but does not require contents to be allocated on the heap. This
  is effectively a tagged union.
- A neat hack is used in the unpacking loop to only do 1 allocation
  in the majority of tuple decoding cases.
@vishesh
Copy link
Contributor

vishesh commented Sep 4, 2025

Thanks for the PR. Curious what is breaking here? Is it the API or something else?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants