Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add wrappers for postgres geometric types. #1332

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

hamiltop
Copy link

For simple Copy types, just newtype them and add impls for FromDatum and ToDatum.

For more complex variable length types, create zero-copy and owned structs with impls for FromDatum and ToDatum as well as SqlTranslatable.


// Copy types

pub type Box = pg_sys::BOX;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In [pgrx/src/datum/date.rs] the pattern is to do a transparent wrapping struct and add From impls. If that's the preferred pattern I can switch these over to that approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Give us a few days to look this over.

I don’t know what the right answer is. I think we might prefer a newtype but then that’ll cause complications over in pl/rust land for types that are already exposed, like BOX. That’s not your responsibility of course, buts it’s something we need to take into consideration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about this a bit more. I think it's fine to leave these as aliases. In fact, it's probably preferred.

A newtype wrapper would imply that pgrx has some more knowledge about the type and, for example, enforces bounds or requires some invariants that we don't.

We basically had to do that for the date/time types b/c even tho the underlying type is something basic like a u32 (or whatever), not all u32s are valid dates or timestamps or whatnot. I don't guess that's the case here.

It looks like Postgres does some validation on the string representation of these types, of course, but it doesn't care to check the actual values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We basically had to do that for the date/time types b/c even tho the underlying type is something basic like a u32 (or whatever), not all u32s are valid dates or timestamps or whatnot.

Yes, that is exactly the case. The types have implicit invariants that aren't going to be enforced by the "raw" pg_sys side, and lifting them into Rust allowed enforcing invariants. If all pairs of points are valid BOXes (I'm talking only about Postgres's conception of validity), then it's fine to just use an alias.

Copy link
Contributor

@eeeebbbbrrrr eeeebbbbrrrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much @hamiltop. This is sincerely appreciated.

In addition to the little nit about prefixing things in pg_sys with pg_sys::, if you could update the table of types in the top-level README.md, that'd be great.

Our documentation is nowhere near where we'd like it to be, but that table is a great quick reference.

Comment on lines 153 to 157
impl IntoDatum for Point {
fn into_datum(mut self) -> Option<pg_sys::Datum> {
unsafe {
let copy = PgMemoryContexts::CurrentMemoryContext
.copy_ptr_into(&mut self, std::mem::size_of::<pg_sys::Point>());
.copy_ptr_into(&mut self, std::mem::size_of::<Point>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we tend to prefix things in pg_sys with pg_sys::. I can't say we're 100% consistent with that throughout the codebase, but my preference would be to shift towards that direction, not away.

The reason is it makes it a lot easier when reading the code to know "this thing is defined by Postgres", which is almost always an important thing to know about a type when working inside pgrx.

So, if you will, please revert this particular change, and if there's anything else in the PR that elides pg_sys::, please add it.

Copy link
Author

@hamiltop hamiltop Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, this isn't quite "eliding pg_sys::". This is a change from pg_sys::Point to geo::Point (which happens to be an alias to pg_sys::Point). It's especially confusing here because they are both spelled and capitalized the same way, but other cases it's Box vs pg_sys::BOX and LineSegment vs pg_sys::LSEG.

Also, this particular line could also be rewritten as std::mem::size_of::<Self>.

I do like the more normal type names, especially in place of ALL_CAPS ones, but if pg_sys:: is preferred I can make that change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change from pg_sys::Point to geo::Point

hmm. yeah, it is.

Also, this particular line could also be rewritten as std::mem::size_of::.

Indeed. Probably good to make that little change.

but if pg_sys:: is preferred I can make that change.

No, this is correct. You're right.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.


// Copy types

pub type Box = pg_sys::BOX;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about this a bit more. I think it's fine to leave these as aliases. In fact, it's probably preferred.

A newtype wrapper would imply that pgrx has some more knowledge about the type and, for example, enforces bounds or requires some invariants that we don't.

We basically had to do that for the date/time types b/c even tho the underlying type is something basic like a u32 (or whatever), not all u32s are valid dates or timestamps or whatnot. I don't guess that's the case here.

It looks like Postgres does some validation on the string representation of these types, of course, but it doesn't care to check the actual values.

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please split the "zero-copy" types into their own PR. We will need to evaluate them separately as they intersect with part of the MemCx project.

For simple `Copy` types, just alias them and add impls for
FromDatum and ToDatum.

For more complex variable length types, create owned structs with
impls for FromDatum and ToDatum as well as SqlTranslatable.
@hamiltop
Copy link
Author

@workingjubilee Updated with removal of zero-copy types. Should I wait for MemCx to land before re-introducing the zero-copy types? Sounds like it's close.

@workingjubilee
Copy link
Member

Not that close, apparently. ^^ Sorry for the delays.

fn into_datum(mut self) -> Option<pg_sys::Datum> {
unsafe {
let ptr = PgMemoryContexts::CurrentMemoryContext
.copy_ptr_into(&mut self, std::mem::size_of::<pg_sys::BOX>());
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're here, let's enforce this style:

Suggested change
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
.copy_ptr_into(&mut self, mem::size_of::<Self>());

fn into_datum(mut self) -> Option<pg_sys::Datum> {
unsafe {
let ptr = PgMemoryContexts::CurrentMemoryContext
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
.copy_ptr_into(&mut self, mem::size_of::<Self>());

fn into_datum(mut self) -> Option<pg_sys::Datum> {
unsafe {
let ptr = PgMemoryContexts::CurrentMemoryContext
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
.copy_ptr_into(&mut self, mem::size_of::<Self>());

fn into_datum(mut self) -> Option<pg_sys::Datum> {
unsafe {
let ptr = PgMemoryContexts::CurrentMemoryContext
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.copy_ptr_into(&mut self, std::mem::size_of::<Self>());
.copy_ptr_into(&mut self, mem::size_of::<Self>());

Comment on lines +45 to +46
#[pg_test]
fn test_lseg_datum() -> spi::Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests look good, but can we also get one or two that show the same types being passed around using the fn_call module? I've noticed sometimes there's odd discrepancies between various ways of moving data in/out using literals and SPI, and in/out using other interfaces. You don't have to do all of them, just whichever type seems most quirky.


// Copy types

pub type Box = pg_sys::BOX;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We basically had to do that for the date/time types b/c even tho the underlying type is something basic like a u32 (or whatever), not all u32s are valid dates or timestamps or whatnot.

Yes, that is exactly the case. The types have implicit invariants that aren't going to be enforced by the "raw" pg_sys side, and lifting them into Rust allowed enforcing invariants. If all pairs of points are valid BOXes (I'm talking only about Postgres's conception of validity), then it's fine to just use an alias.

}
}

pub type Point = pg_sys::Point;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All pairs of f64s are valid Points, right?

@@ -72,3 +163,214 @@ impl IntoDatum for pg_sys::Point {
pg_sys::POINTOID
}
}

pub type Circle = pg_sys::CIRCLE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same "are there any implicit invariants?" question.

@workingjubilee
Copy link
Member

A rough draft for MemCx<'mcx> landed in #1342.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants