-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MetricSummary::bounds cleanup #390
base: main
Are you sure you want to change the base?
Conversation
MetricSummary: - Delegate to I64Range for infinity rather than duplicating the concept. I64Range - Cleanup and encapsulate I64Range's handling of infinite bounds. - Enforce validity in new factory functions `empty` and `infinite`. (method `extend` already did) - Expose #[inline] left and right methods. - Privatize left and right fields. - Factor out new method `both` out of various places that needed access to finite left and right. Closes issue #386.
let (left, right); | ||
match self.bounds.both() { | ||
None => return Err(CounterError::BoundsInvalid), | ||
Some((l, r)) => { | ||
left = l; | ||
right = r; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems there really should be a cleaner way to do this unwrap_or_return
behavior, but if there is I don't know it.
@@ -0,0 +1,62 @@ | |||
//! These data structures are stable, meaning they may not change even in | |||
//! their layout in memory, as the raw bytes-in-memory are serialized and | |||
//! exchanged by PostgreSQL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JLockerman can correct me if I have this wrong, but this is not actually the case for these objects. The MetricSummary
is only used in the internal state, which is never persisted across version upgrades. The objects that aren't allowed to change are the pg_type
declared final objects, CounterSummary
and FlatSummary
in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that would be a relief! If that's so, why do we test it? If we don't actually require stability of that serialization, we're just paying a tax every time we change it:
https://github.com/timescale/timescaledb-toolkit/blob/main/extension/src/counter_agg.rs#L972
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR: you're both partially right, the situation is confusing, and the problem goes away once we stop caring about timescaledb versions older than 2.7.0
Ah this is the confusing bit: postgres has two different methods of serialziation
varlena
types must exist as a flat memory buffer, and are written to disk in the same format as they exist in memory.- If you want to use an aggregate in parallel execution, you need to be able to serialize the transition state into some (machine-agnostic?) wire format for IPC. Notably, we used to store the output of this in continuous aggregates, though that's slated to change in timescaledb 2.7.0
@WireBaron is saying that these types aren't the first kind; they only exist in transition functions–ephemerally in memory–and therefore they can change between version. The test is checking that we can serialize this to disk stably in continuous aggregates, which is a slightly weaker invariant; we can change the in-memory representation freely as long as we can keep deserializing old partials. Hopefully we'll be able to get everyone onto the new continuous aggregates format soon, and we'll be able to stop caring about the second form of stability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably this means that the following comment
//! Note that [MetricSummary] is already in violation, as it does not lock in
//! a memory representation and the Rust project makes no guarantees to
//! preserve this across releases of the compiler. We should bump its
//! serialization version and repr(C) the new one.
is incorrect: the rust-memory layout doesn't really matter here, what matters is that serde guarantees layout-stability for the serialized form of a given struct definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(the incorrect leap i was making was from "we directly derive the serialization form from the in-memory layout" (true) to "we serialize the actual in-memory layout" (false); the latter necessarily leads to "we need repr(C)", it's just not so! :)
Josh points out on a call that what I'm doing here is backwards: adding a second struct from which to derive (de)serialize code from when instead I should stop deriving and implement them directly. Which I'll do when I return to this, likely next week.
Thanks!
#[derive(Clone, Debug, PartialEq, Deserialize, Serialize)] | ||
// TODO Our one serialization test (counter_byte_io) passes, but is that just luck? | ||
//#[repr(C)] | ||
pub struct MetricSummary { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love creating this new struct that only differs by bounds
being an option. Is this because you thought the original MetricSummary
wasn't something that could be changed? In this case, I'd prefer changing the original object. Failing that, maybe we could make bounds
non-public and add a bounds()
accessor that returns an option?
MetricSummary:
I64Range
empty
andinfinite
.(method
extend
already did)both
out of various places that needed accessto finite left and right.
Closes issue #386.