Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres Materialized CTE functions #354

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

joelmccracken
Copy link

@joelmccracken joelmccracken commented Mar 7, 2023

Adds withMaterialized function to support this non-standard postgres behavior.

Before submitting your PR, check that you've:

After submitting your PR:

  • Update the Changelog.md file with a link to your PR.
  • Check that CI passes (or if it fails, for reasons unrelated to your change, like CI timeouts).

@joelmccracken
Copy link
Author

fixes #345

@joelmccracken joelmccracken marked this pull request as ready for review March 8, 2023 04:45
@joelmccracken joelmccracken changed the title [WIP] Postgres Materialized CTE functions Postgres Materialized CTE functions Mar 8, 2023
@joelmccracken joelmccracken force-pushed the jnm/add-cte-materialized-to-postgres branch from aeef50c to f65a42d Compare March 8, 2023 04:46
@belevy
Copy link
Collaborator

belevy commented Mar 8, 2023

Is there a way that we can avoid more postgres specific stuff in the internal module? Since NOT MATERIALIZED comes after the AS could we just prepend it to the query? Perhaps we just need a more generic type CTEModifier = IdentInfo -> TLB.Builder

@joelmccracken
Copy link
Author

we could somehow insert it? the key part i think is:

    cteClauseToText (CommonTableExpressionClause _ cteMat cteIdent cteFn) =
        first
            (\tlb -> useIdent info cteIdent <> " AS " <> materializedText cteMat <> parens tlb)
            (cteFn info)

and the materializedText cteMat is the offending portion; the thing i'm really just unsure of is how to "insert" something there without like having an parameter to CommonTableExpressionClause which is like afterAsModifier which would let something else also insert something after the AS before the subquery. If you prefer that, i'll do it, It just also feels inelegant to me =(

@belevy
Copy link
Collaborator

belevy commented Mar 9, 2023

Yah so the whole point is to keep that modifier text open for extension. Its obviously less safe from an implementation perspective but who is to say that MySQL won't come out with their own custom modifiers. You have a couple of options here that are all pretty distateful each in their own way.

  1. Strict seperation of the SqlQuery type by backend. Each backend has a completely unique SqlQuery type, this is then the subject of typeclasses to support code reuse. This is trading strict correctness checking in the types for potential duplication in the implementation as much of the query language is standardized with a few changes.

  2. Parameterize a SqlQuery by its constituent parts (i.e. data SqlQuery cteClause fromClause whereClause ... =). This has the nice effect of letting you ignore the differing parts when they don't matter. It however still has some major duplication issues that show up the minute you need to make a custom clause type and for total safety you will end up needing to create custom clauses for each part.

  3. Create a single unified type that has all of the cases for all of the backends baked in, this is what esqueleto generally has done in the past. You have a single sum type that needs to handle all of the different possible backends. This is super simple but has a nasty downside in that it drags everything into the internal module, since the type is not open for extension from a different module we end up needing to modify the mega sum type, this also means that you don't really have safety to prevent mixing statements that are specific to different backends and also means you have to rely on all of the backends always (it is currently impossible to break esqueleto-core, esqueleto-postgres, esqueleto-mysql, and esqueleto-sqlite out into independent packages).

  4. The approach that I have been tending towards for use in libraries lately which is something akin to a monomorphic tagless final. In this approach there is a single SqlQuery type that is the final representation of a query (or near enough) and has extension points where the backends differ. This way we can share almost all of the code and only have to change exactly what differs, this still requires the internal module to know about how backends might wish to customize the core type but it frees us of the compile time dependency. This approach is very much like OOP where you can envision all of the functions producing SqlQuery as classes that implement a SqlQuery interface ala the interpreter pattern. This has a nasty downside of removing compile time help from the implementer but I have felt that it ended up making far simpler code when I switched the Experimental module to this style.

So to answer your question, yes leaving the modifier as a free for all is kind of icky but the end user won't see it only the implementer.

@joelmccracken joelmccracken force-pushed the jnm/add-cte-materialized-to-postgres branch from f65a42d to daec0db Compare March 17, 2023 23:22
@joelmccracken
Copy link
Author

@belevy thank you for explaining. I think I got it, and have addressed your comments.

Let me know if there is anything else to address!

@joelmccracken
Copy link
Author

I ran stylish-haskell on the files that were modified in this PR so I could mark off that check list item, just in case it caused anyone pause. For reference, however, it failed for me on Internal.hs.

for f in $(git diff --name-only origin/master | grep '.hs'); do stack exec stylish-haskell -- -i $f || continue ; done
src/Database/Esqueleto/Internal/Internal.hs: RealSrcSpan SrcSpanPoint "src/Database/Esqueleto/Internal/Internal.hs" 427 1: lexical error in string/character literal at character 's'

Copy link
Collaborator

@belevy belevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for moving the postgresql specifics out of Internal. I feel bad requesting more changes, especially after how long I took reviewing. I think we are getting close, perhaps @parsonsmatt has opinions but I feel like with the exception of the few notes I made it seems pretty good.

src/Database/Esqueleto/Internal/Internal.hs Outdated Show resolved Hide resolved
-- [Common Table Expression Materialization](https://www.postgresql.org/docs/14/queries-with.html#id-1.5.6.12.7).
--
-- /Since: 3.5.10.0/
withMaterialized :: ( ToAlias a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to also support non materialized so perhaps the materialization modifier can be passed in as an argument and we can provide helpers materialized and notMaterialized to safely select from. In fact, the more general versions could be in the core module and then whether we export with as withInternal noModifier or withInternal is whether you use with from the core module or with from this module.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here was what I was thinking with this PR:

  • with, in Database/Esqueleto/Experimental/From/CommonTableExpression.hs, is
    usable for any of the sql backends; it doesn't do anything with materialized.
    • If a postgres user wants to just use regular with CTE and have postgres
      figure out if it wants to materialize it or not, they can just use with
      from the normal module.
  • the MATERIALIZED modifier is only supported in postgres, so the ability to
    specify this behavior is in the pg specific module.
    • If a pg user wishes to force pg to materialize the cte, they must import the
      pg module and use withMaterialized
  • But, besides MATERIALIZED, there is also a NOT MATERIALIZED possible
    modifier, which I have not added support for in this version . I can add
    it if desired, however, but I'll discuss why I didn't do it below.

Given that, when you say:

We probably want to also support non materialized

Are you referring to when no materialization modifier is specified, or when
the NOT MATERIALIZED modifier is intended to be used? Either way is fine; it
makes sense to include something that would do NOT MATERIALIZED, for
completeness.


On the topic of supporting the NOT MATERIALIZED modifier, here is what my
thought process was:

I saw the following function pattern when investigating this feature:

  • with
  • withRecursive

To me, this implied I should add these functions:

  • withMaterialized
  • withMaterializedRecursive

But, for NOT MATERIALIZED, following the pattern would result in even more
functions (and this just seemed like too many):

  • withNotMaterialized
  • withNotMaterializedRecursive

So, I didn't add them; totally fine with adding them, I just want to make sure
we're talking about the same thing.


Also, I just re-examined the docs for pg 12-15, and I notice that recursive CTEs are
always "materialized", so withMaterializedRecursive and
withNotMaterializedRecursive are superfluous. FYI, according to my tests
with pg 15, all of these combinations actually function, even if they all
actually behave in the same way.


So, given the above, I think the API should be one of these two choices:

  • first scenario, I delete withMaterializedRecursive and add withNotMaterialized
  • second scenario, add modifiers as arguments
    • pg module would have
      • with function having two arguments
        • first argument is Maybe Materilalization
          • data Materilization = Materialized | NotMaterialized
          • So three possible settings,
            • None for providing no materialized modifier
            • Just Materialized for providing MATERIALIZED modifier
            • Just NotMaterialized for providing NOT MATERIALIZED modifier
        • the second argument is query, as it is today
      • withRecursive is deleted
    • the regular with function should have the same API as today in Experimental.From.CommonTableExpression.

Thoughts?

Copy link
Author

@joelmccracken joelmccracken Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@belevy I went ahead and made this change anyway; to me the alternate implementation seems more confusing. So for users of the library, if they want to use regular CTE, they use with from esqueleto/src/Database/Esqueleto/Experimental/From/CommonTableExpression.hs.

If they want the the postgresql specific behavior, they use withMaterialized and withNotMaterialized from esqueleto/src/Database/Esqueleto/PostgreSQL.hs

But if you do want it to change, please let me know; i'd like to get this done sooner rather than later

@joelmccracken joelmccracken force-pushed the jnm/add-cte-materialized-to-postgres branch from 7b2d34d to b0ebe5d Compare July 27, 2023 22:01
@joelmccracken joelmccracken force-pushed the jnm/add-cte-materialized-to-postgres branch from b0ebe5d to 07dd001 Compare January 19, 2024 22:10
@joelmccracken joelmccracken force-pushed the jnm/add-cte-materialized-to-postgres branch from 07dd001 to 4a56932 Compare February 15, 2024 00:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants