Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spread operator for many-to-many relationships, aliases, and aggregations #3041

Open
jdgamble555 opened this issue Nov 7, 2023 · 4 comments · May be fixed by #3640
Open

Spread operator for many-to-many relationships, aliases, and aggregations #3041

jdgamble555 opened this issue Nov 7, 2023 · 4 comments · May be fixed by #3640
Assignees
Labels
embedding resource embedding enhancement a feature, ready for implementation

Comments

@jdgamble555
Copy link

Currently the spread operator works on one-to-one and many-to-one relationships. We need to get it to work in all situations like many-to-many relationships:

Schema:

CREATE TABLE posts (
  id uuid NOT NULL PRIMARY KEY DEFAULT uuid_generate_v4(),
  title text NOT NULL CHECK (char_length(title) > 2),
  ...
)
CREATE TABLE tags (
  name text,
  pid uuid REFERENCES posts(id) ON DELETE CASCADE ON UPDATE CASCADE,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (name, pid)
);

-- for later example

CREATE TABLE hearts (
  pid uuid NOT NULL REFERENCES posts(id) ON DELETE CASCADE ON UPDATE CASCADE,
  uid uuid NOT NULL REFERENCES profiles(id) ON DELETE CASCADE ON UPDATE CASCADE,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (pid, uid)
);

Spread on Item

GET /posts?select=title,tags(...name)

Response:

[
  {
    "title": "post title",
    "tags": ["technology", "buddy"]
   }
]

Spread on Count

GET /posts?select=title,...hearts(hearts_count:count)

Response:

[
  {
    "title": "post title",
    "hearts_count": 10
  }
]

Sort by Foreign Column (with alias and spread)

Would help with the sorting by foreign column:

GET /posts?select=title,...hearts(hearts_count:count)&order=hearts_count.desc

Multiple Levels

Of course if you model your tags with a junction table, you need something like this that works for multiple levels:

GET /posts?select=title,...posts_tags(tags(...name))

Response:

[
  {
    "title": "post title",
    "tags": ["technology", "food"]
   }
]

Spread on Table (with duplicate column name)

I'm not sure how useful this one would be, as this may could be translated differently.

GET /posts?select=title,created_at,...tags(name,tag_created_at:created_at)

Response:

[
  {
    "title": "post title",
    "name": ["technology", "buddy"],
    "tag_created_at": ["1-1-01", "1-2-02"]
  }
]

I think the spread operator has so much potential to simplify things. It could also simplify the queries under the hood.

J

@jdgamble555
Copy link
Author

jdgamble555 commented Nov 7, 2023

Just thought of a complex example here. Imagine that you want to get all the related posts. Related posts are posts that share the same tags as the current posts sorted by relevance (number of duplicate posts found).

Imagine if you could just do this:

GET /posts?select=...tags(posts(*,count))&id=eq.0x123&posts.id=not.eq.0x123&order=count

Instead of a custom function like this:

DROP FUNCTION IF EXISTS related_posts;
CREATE OR REPLACE FUNCTION related_posts(id uuid)
RETURNS SETOF posts
LANGUAGE sql
AS $$
  SELECT p.* 
  FROM posts p
  JOIN tags t1 ON p.id = t1.pid
  JOIN tags t2 ON t1.name = t2.name 
    AND t2.pid = related_posts.id
  WHERE p.id != related_posts.id
    AND p.status = 'published'
  GROUP BY 
    p.id, 
    p.parent, 
    p.status,
    p.title, 
    p.author, 
    p.content, 
    p.image, 
    p.slug, 
    p.minutes, 
    p.created_at, 
    p.updated_at, 
    p.published_at
  ORDER BY count(*) DESC;
$$;

This would technically require you to be able to put count and another column in the same request. This is just some ideas of where postgREST could be one day.

This would also solve a lot of Group By problems:

#158

J

@steve-chavez
Copy link
Member

steve-chavez commented Nov 7, 2023

Spread on Count
GET /posts?select=title,...hearts(hearts_count:count)

@jdgamble555 Sounds like a good idea. We've been wanting this for aggregate functions #2925.

I'm thinking the rule should be to allow to-many spreads if there's only aggregates (no other columns) on the embedded relation. This to ensure there's only one row present.

GET /posts?select=title,tags(...name)
[
{
"title": "post title",
"tags": ["technology", "buddy"]
}
]

To be consistent with the current syntax:

GET /posts?select=title,...tags(name)

The restriction for to-many spreading for this case would be to have a single column on the embedded relation.

Sort by Foreign Column (with alias and spread)
Would help with the sorting by foreign column:

Looks related to #2730. If we solve spread on aggregates I believe this will come for free.

Spread on Table (with duplicate column name)
I'm not sure how useful this one would be, as this may could be translated differently.
GET /posts?select=title,created_at,...tags(name,tag_created_at:created_at)
[
{
"title": "post title",
"name": ["technology", "buddy"],
"tag_created_at": ["1-1-01", "1-2-02"]
}
]

That one doesn't seem that useful. The user would have to match both arrays to get the complete data and we'd need to ensure the order somehow. Seems hard to implement too.


Not sure yet of the shape of the queries.

@steve-chavez steve-chavez added the idea Needs of discussion to become an enhancement, not ready for implementation label Nov 7, 2023
@wolfgangwalther wolfgangwalther changed the title FEATURE REQUEST - spread operator for many-to-many relationships, aliases, and aggregations Spread operator for many-to-many relationships, aliases, and aggregations Nov 7, 2023
@philmetzger
Copy link

I would love this to be a thing!
Have the same issue where i want to do a filter for overlap on a joined tags table.

@laurenceisla laurenceisla self-assigned this Jul 4, 2024
@laurenceisla laurenceisla added enhancement a feature, ready for implementation embedding resource embedding and removed idea Needs of discussion to become an enhancement, not ready for implementation labels Jul 4, 2024
@laurenceisla
Copy link
Member

Not sure yet of the shape of the queries.

I'm giving this a shot in #3640. My approach is to wrap all the selected columns in json_agg(). This would allow us to select more than one column in the embedded spread. For example, this request:

curl 'localhost:3000/clients?select=name,...projects(name,id)'

Would roughly translate to:

SELECT "test"."clients"."name",
       "clients_projects_1"."name",
       "clients_projects_1"."id"
FROM "test"."clients"
LEFT JOIN LATERAL (
  SELECT json_agg("projects_1"."name") AS "name",
         json_agg("projects_1"."id") AS "id"
  FROM "test"."projects" AS "projects_1"
  WHERE "projects_1"."client_id" = "test"."clients"."id"
) AS "clients_projects_1" ON TRUE

Then, the response would be:

[
  {"name":"Microsoft","name":["Windows 7", "Windows 10"],"id":[1, 2]},
  {"name":"Apple","name":["IOS", "OSX"],"id":[3, 4]}
]

I'll be updating any caveats I encounter in the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
embedding resource embedding enhancement a feature, ready for implementation
Development

Successfully merging a pull request may close this issue.

4 participants