Add bytea casts #21

tsutsu · 2021-03-05T16:58:38Z

Submitted for consideration of whether this is something useful to upstream. This shouldn't be merged as-is (no docs or benchmarks yet.)

This patch enables efficient interconversion between mpz and bytea, where the bytea is interpreted as a "packed big-endian" or "base-256" bitstring representation of an integer.
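As a rough model of the "packed big-endian" interpretation described above, the mapping can be sketched in Python. This is only an illustration and not the patch's C code: the function names are hypothetical, and the choice to encode zero as an empty byte string is an assumption of this sketch.

```python
def bytes_to_uint(data: bytes) -> int:
    """Read bytes as a big-endian, base-256 unsigned integer (empty -> 0)."""
    return int.from_bytes(data, "big")


def uint_to_bytes(value: int) -> bytes:
    """Minimal-length big-endian encoding of a non-negative integer.

    Assumption of this sketch: zero encodes as an empty byte string.
    """
    if value < 0:
        raise ValueError("unsigned encoding needs a non-negative value")
    length = (value.bit_length() + 7) // 8
    return value.to_bytes(length, "big")
```

For instance, the two bytes 0x01 0x00 read as 1 * 256 + 0 = 256.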

Our company works to analyze data sourced from Ethereum, where most numeric data is represented as a "uint256" (256-bit unsigned integer) type, usually transmitted serialized as a hex value. We store billions of these uint256 values in our DB. We index them, aggregate over them, and also bulk-encode(them, 'hex') for presentation. Sometimes they're actually numbers. Sometimes they're not. Thinking of the raw data as something more like "arbitrary contents of a 32-byte-wide machine vector register" might make more sense.

We have found that storing the data as numeric, while efficient for math, is highly inefficient for converting-to-hex (it's very difficult to write a native base conversion routine between numeric's base-10000, and hex's base-16.) Storing the data as an mpz would almost work, but breaks wire compatibility for clients that want to consume the data in its native binary representation (e.g. Elixir's Postgrex library.)

Ultimately, we have chosen to store these values in Postgres as raw byteas. This gives us the highest storage efficiency; allows us to use the native, highly-efficient Postgres function encode(col, 'hex'); and also is a lossless transformation from the original hex-encoded representation, for cases where the value turns out to be non-numeric (e.g. a packed struct) where we'd want to retain and re-create leading zeroes on encode.
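A small Python illustration of the leading-zero point: hex-encoding the raw bytes round-trips exactly, while going through an integer drops the padding. The sample value and helper names are made up for the example.

```python
def hex_of_raw(raw: bytes) -> str:
    """Hex-encode the bytes as stored; leading zero bytes are preserved."""
    return raw.hex()


def hex_of_integer(raw: bytes) -> str:
    """Hex-encode after reading the bytes as an integer; padding is lost."""
    return format(int.from_bytes(raw, "big"), "x")


sample = bytes.fromhex("0000002a")  # hypothetical 4-byte value
raw_hex = hex_of_raw(sample)
int_hex = hex_of_integer(sample)
```

Here `raw_hex` keeps all eight hex digits, while `int_hex` keeps only the significant ones.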

A bytea normally offers no efficient way to perform math on its contents, but with this patch we can cheaply cast bytea (base-256) values to mpz (base-4294967296), perform the aggregate, and then encode the result as hex (or cast it back to bytea).
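The cast, aggregate, encode pipeline can be modelled outside the database as follows. This is a sketch under assumptions: `encode_sum` is a hypothetical name, Python integers stand in for mpz, and padding the result to at least 32 bytes is a choice made for the example.

```python
from typing import Iterable


def encode_sum(rows: Iterable[bytes], min_width: int = 32) -> str:
    """Sum big-endian unsigned values and hex-encode the total.

    The result is padded to at least min_width bytes and grows if the
    sum no longer fits in that width.
    """
    total = sum(int.from_bytes(row, "big") for row in rows)
    width = max(min_width, (total.bit_length() + 7) // 8)
    return total.to_bytes(width, "big").hex()
```

Note that a sum of uint256 values can exceed 256 bits, so the encoded result may be wider than any single input.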

This has worked exceedingly well for us so far. We have been using this patch in production for around two years now, with no hiccups.

The only issue is that it's not upstreamed, so we have to manually build and install our own fork of pgmp for every Postgres instance we run!

If you like this code/the idea behind it, let me know what should be done to polish it up and get it ready to contribute. Thanks!

P.S. In our production databases, we have also defined implicit assignment casts between the numeric and bytea types, that take the value through mpz as an intermediate representation. The cast from mpz to numeric is not particularly efficient, but due to GMP's highly-efficient data structures, it seemed to still be cheaper for bulk conversions than the memory access pattern created by the naive direct base-256 to base-10000 conversion routine I wrote as an alternative. (Though, obviously, pmpq_to_numeric could probably still be optimized further; it allocates an intermediate string!)
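To make the base-256 to base-10000 step concrete, here is a Python sketch that goes through an arbitrary-precision integer, which is analogous to routing the conversion through mpz. It is not the naive direct routine mentioned above, and `bytes_to_base10000` is a hypothetical name.

```python
def bytes_to_base10000(data: bytes) -> list:
    """Convert big-endian base-256 bytes to base-10000 digits.

    Digits are returned most significant first; zero gives an empty list.
    """
    value = int.from_bytes(data, "big")
    digits = []
    while value:
        value, digit = divmod(value, 10000)
        digits.append(digit)
    return digits[::-1]
```

For example, the bytes of 65536 (0x01 0x00 0x00) come out as the two base-10000 digits 6 and 5536.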

However efficient it is, it's definitely a better option than doing this base-conversion in PL/pgSQL. And that fact—plus having it "built in" to a library that gets packaged by Debian et al—is "good enough" for us, and probably most people. As such, it might make sense to consider having this library also define bytea↔numeric casts, iff the DB doesn't already have them.

dvarrazzo · 2021-03-05T17:32:38Z

Thank you, it is an interesting use case. A few thoughts...

I understand that, as the mpz_import/export do, these operations drop the sign, right?
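For context, GMP documents that mpz_export ignores the sign and exports only the absolute value. The effect of a magnitude-only export can be modelled in Python as below; `export_magnitude` is a hypothetical name, and whether the patch behaves this way is exactly what is being asked here.

```python
def export_magnitude(value: int) -> bytes:
    """Export only |value| as minimal big-endian bytes; the sign is dropped."""
    magnitude = abs(value)
    return magnitude.to_bytes((magnitude.bit_length() + 7) // 8, "big")
```

With such an export, -5 and 5 produce the same bytes, so a round trip through bytea cannot recover the sign.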

I wonder whether, instead of a cast, it would be more appropriate to expose some import/export functions. The cast from mpz to bytea is something I have used heavily during development to deal with the structure as a whole (i.e. a no-function cast), and I wouldn't really want to lose that feature.

Something that ties probably with your need to convert between binary and mpz is the implementation of binary send/receive functions (see #5). It would be mandatory that those functions dealt with the sign though.

All in all, it seems a feature that is surely useful for your use case, but because it is not completely generic (it doesn't cover the entire mpz domain), maybe it would be better implemented as functions. I don't think there is any performance difference between a cast and a function, right?

tsutsu · 2021-03-05T20:19:54Z

It makes sense to me that functions that don't deal with the entire domain would not make for valid casts or send/receive functions. We've only been dealing with encodings of values that are always-unsigned, so this hasn't been a problem for us so far, but it's definitely important to make choices that are widely applicable.

My thoughts:

pmpz_to_bytea should probably automatically encode negative mpz values into bytea using two's-complement representation.
The question, then, is how long the resulting bytea would be, given that external systems may be zero-extending the received value to some fixed bit-width and only then checking it for sign. The most flexible solution to that problem is to make the export function arity-3, where the third parameter is an expected "register bit-width" to export to. The resulting bytea would then be at least that long.
The import function bytea_to_pmpz could be exposed as a function with both unsigned-import and signed-two's-complement-import variants. For the signed-two's-complement-import variant, it could have an optional register-bit-width parameter, which, if specified, would treat the value as zero-extended to that bit-width before trying to determine the sign. Without the parameter, the data in the bytea would be treated as having been sign-extended to infinite bit-width.
Useful casts could still be introduced, but rather than direct interconversion between mpz and bytea, it would make more sense to introduce a Postgres DOMAIN for always-non-negative mpz (which could be called mpn, since PGMP doesn't expose GMP's mpn); and then to define the implicit casts as interconverting between mpn and bytea. This would provide an easy+convenient querying interface for those who go to the initial effort of getting the DB to validate a column as being specifically mpn, while allowing everyone else who's just holding regular mpz values to express their specific intent at query-time with an explicit 3-arg conversion function.
If PGMP introduced octet_length-constrained DOMAINS for bytea (e.g. bytea1, bytea2, bytea3, ... bytea32) — where the DOMAIN is only an upper limit (so values can be packed shorter), but where the type provides a hint about the intended or original, untrimmed size of the octets — then implicit casts could be defined between these types and mpz, with proper two's-complement semantics, as each cast would know the proper value for the register-width parameter. Of course, there could also be casts defined between mpn and these bytea DOMAIN types, which would be cheaper than the mpz equivalents.
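The register-width semantics proposed in the points above can be sketched in Python. Everything here is illustrative: the function names and signatures are hypothetical, and the behaviour is only one reading of the proposal (export is at least `register_bits` wide; import with a width zero-extends before checking the sign bit; import without a width treats the bytes as sign-extended).

```python
from typing import Optional


def export_twos_complement(value: int, register_bits: int) -> bytes:
    """Big-endian two's-complement encoding, at least register_bits wide."""
    if register_bits < 0 or register_bits % 8 != 0:
        raise ValueError("register_bits must be a non-negative multiple of 8")
    # Bits needed for a signed encoding, including the sign bit.
    magnitude_bits = (~value if value < 0 else value).bit_length()
    needed_bytes = (magnitude_bits + 1 + 7) // 8
    width = max(register_bits // 8, needed_bytes)
    return value.to_bytes(width, "big", signed=True)


def import_twos_complement(data: bytes,
                           register_bits: Optional[int] = None) -> int:
    """Decode big-endian two's complement.

    Without register_bits, the top bit of the data is the sign bit
    (the data is treated as sign-extended to any width). With
    register_bits, the data is zero-extended to that width first.
    """
    if register_bits is None:
        return int.from_bytes(data, "big", signed=True)
    unsigned = int.from_bytes(data, "big")
    if unsigned >= (1 << register_bits):
        raise ValueError("value does not fit in the register width")
    if register_bits > 0 and unsigned >= (1 << (register_bits - 1)):
        return unsigned - (1 << register_bits)
    return unsigned
```

Under this reading, the single byte 0xff imports as -1 without a width but as 255 with a 256-bit register, which is the ambiguity the width parameter is meant to resolve.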

This would serve our own use-case pretty well — our production DB value columns could be (quite correctly) re-typed to be bytea32; and we would rewrite our queries to use ::mpn rather than ::mpz casts.

Even without the introduction of all these additional types and casts, we could still put them in our own DB, as long as the core arity-3 functions were there to define them in terms of. But I feel that at least the introduction of mpn and the cast between mpn and bytea is a very "obvious" and universally-useful feature for users of this library. The byteaN types and their respective casts might be widely-useful as well, although I'm not as sure.

tsutsu · 2021-06-08T15:58:14Z

I've started working on making the changes I suggested earlier. (Our data scientist found a place where we need to interpret our bytea values as signed mpz values, so it got prioritized. 😄 )

I've added an explicit PG function, for now called mpz_2c(bytea) (underlying C function: pmpz_from_bytea_signed), that acts like the explicit PG function mpz(any), but which treats the bytea as being a representation of a big-endian machine-register dump of a two's complement signed integer. The octet length of the bytea is assumed to be the machine-register width, such that if the bytea's length is nonzero, and the MSB of the first byte of the bytea is 1, then the value is negative.
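The semantics described for mpz_2c(bytea) can be modelled in a few lines of Python. This is a reference model of the stated behaviour only, not the C implementation, and `mpz_2c_model` is a hypothetical name.

```python
def mpz_2c_model(data: bytes) -> int:
    """Read data as a big-endian two's-complement register dump.

    The register width is the octet length of the data; the value is
    negative when the data is non-empty and the MSB of the first byte is 1.
    """
    unsigned = int.from_bytes(data, "big")
    negative = len(data) > 0 and (data[0] & 0x80) != 0
    if negative:
        return unsigned - (1 << (8 * len(data)))
    return unsigned
```

So 0x80 reads as -128, while 0x00 0x80 reads as 128 because the wider register leaves the sign bit clear.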

I'm unsure whether this is the most efficient implementation for doing two's complement absolute-value "during" a libgmp import of a stream of bytes. I couldn't come up with a good way to get libgmp to do the whole absolute-value step itself (it seems like it'd require an allocation of a temporary mpz to hold an appropriate xor value), so I had to do part of it before the import, requiring an extra palloc+pfree (which really annoys me, since the positive version manages to get away with no temporary local allocations, only the escaping allocation of the resulting mpz.) Any optimization advice would be appreciated, before I add additional features. (e.g. does VARDATA_ANY give you a copy, such that it would actually be safe to destructively modify the byte-buffer returned by it?)
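For readers following along, one way to split the two's-complement absolute value around an unsigned import is: invert the bytes in a temporary buffer, import the result as an unsigned magnitude, then add one and negate. The Python sketch below shows that arithmetic only; it is an assumption about the general approach, not the code in this branch, and `import_signed_via_inversion` is a hypothetical name.

```python
def import_signed_via_inversion(data: bytes) -> int:
    """Signed import built on an unsigned import plus byte inversion."""
    if not data or (data[0] & 0x80) == 0:
        # Non-negative: the unsigned import is already the answer.
        return int.from_bytes(data, "big")
    # Temporary inverted copy (the analogue of the extra allocation).
    inverted = bytes(b ^ 0xFF for b in data)
    # For a w-bit pattern x with the sign bit set, the value is x - 2**w,
    # which equals -((~x mod 2**w) + 1).
    return -(int.from_bytes(inverted, "big") + 1)
```

The identity holds because inverting a w-bit pattern x yields 2**w - 1 - x, so adding one and negating gives x - 2**w.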

Also, again, let me know whether you think the design I outlined above (with the DOMAIN types et al) is one worth pursuing / one you'd want to have shipped as part of the extension, before I go and commit to actually implementing all of that.

Add bytea IO functions

9163b39

Add 2c-signed bytea import function

91b6e74

tsutsu force-pushed the feature-bytea-io-conv branch from 85c3600 to 91b6e74 · June 7, 2021 19:05
