## Summary
Note: filed as a "question" rather than a bug since I am not including a working reproduction.
V5 splits some foreign-key joins into multiple SQL queries, producing SQL similar to the following in the subfield query:
```sql
select __result__.*
from (
  select
    ids.ordinality - 1 as idx,
    (ids.value->>0)::"int8" as "id0"
  from json_array_elements($1::json) with ordinality as ids
) as __identifiers__,
lateral (
  select
    __table__.id::text as "0",
    __table__."data"::text as "1",
    __identifiers__.idx as "2"
  from "public"."ldata_node_event_ingress" as __table__
  where (__table__."id" = __identifiers__."id0")
) as __result__;
```
For contrast, here is roughly what V4 produced to query the same data:
```sql
select to_json((
  json_build_object(
    '__identifiers'::text,
    json_build_array(((__local_1__."id")::numeric)::text),
    'id'::text,
    ((__local_1__."id"))::text,
    '@table'::text,
    (
      select json_build_object(
        '__identifiers'::text,
        json_build_array(((__local_2__."id")::numeric)::text),
        'id'::text,
        ((__local_2__."id"))::text,
        '@data'::text,
        (((__local_2__."data"))::text)
      ) as object
      from "public"."table" as __local_2__
      where (__local_1__."table_id" = __local_2__."id") and (TRUE) and (TRUE)
    )
  )
)) as "@tables"
from "public"."top_level_table" as __local_1__
```
The former query is much slower when RLS is involved if `public.table` is large, as the planner creates a plan with a very bad estimate for the number of rows that will pass the filter. The plan usually looks like this:
```
Hash Join  (cost=... rows=75 width=...) (actual time=... rows=0 loops=1)
  Hash Cond: (__table__.id = ((ids.value ->> 0))::bigint)
  ->  Seq Scan on table __table__  (cost=... rows=1000000 width=...) (actual time=... rows=0 loops=1)
        Filter: (table_rls(...))
        Rows Removed by Filter: 1500000
  ->  Hash  (cost=1.00..1.00 rows=100 width=40) (never executed)
        ->  Function Scan on json_array_elements ids  (cost=0.00..1.00 rows=100 width=40) (never executed)
```
Note that the estimate for the `Seq Scan` is extremely bad: it assumes most of the table is visible, but RLS actually filters out almost all of it.

This might be exacerbated by the poor estimate for the cardinality of `ids`, as the plan above was made with `ids := [null]`. The cardinality of `json_array_elements` is always estimated as 100, regardless of input.
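The flat 100-row estimate comes from the function's declared `ROWS` value and can be observed directly; a minimal sketch (exact costs will vary by version and configuration):

```sql
-- The planner uses the declared ROWS estimate for json_array_elements
-- regardless of how many elements the input array actually contains:
explain select * from json_array_elements('[null]'::json);
--  Function Scan on json_array_elements  (cost=... rows=100 width=32)
```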
In V4, the plan is much better. I'm just guessing, but this might be because the planner joins on the ID directly, so the cardinality of the join is not obscured by the `json_array_elements` function. Here is an outline of the plan:
```
SubPlan 1
  ->  Index Scan using table_pkey on table __local_2__  (cost=... rows=1 width=...)
        Index Cond: (id = __local_1__.table_id)
        Filter: (table_rls(...))
```
## Workarounds
Our workaround right now is to inject comments for the `pg_hint_plan` extension into the generated SQL, but this needs manual intervention and requires a patched PostGraphile. This might be an extremely rare edge case, but I'd like to document it just in case somebody else runs into similar issues.
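For reference, the injected hint looks roughly like the following (a hedged sketch; the alias names are taken from the generated query above, and the row count is hypothetical). `pg_hint_plan` reads the comment at the start of the query:

```sql
/*+ Rows(__table__ __identifiers__ #1) IndexScan(__table__) */
select __result__.*
from ( ... ) as __identifiers__,
lateral ( ... ) as __result__;
```

The `Rows` hint corrects the join-output estimate, and `IndexScan` steers the planner away from the sequential scan that the bad estimate otherwise favors.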
It should also be possible to adjust the RLS policy to call a function that properly specifies a rows estimate/selectivity, but this requires a custom C extension to define a planner support function. We use AWS RDS and can't do this.
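For completeness, the SQL side of that approach would look roughly like this (PostgreSQL 12+ `SUPPORT` clause; `table_rls_support` is a hypothetical C function handling `SupportRequestSelectivity`, and installing it is exactly what RDS disallows):

```sql
create function table_rls(id int8) returns boolean
  language sql stable
  support table_rls_support  -- hypothetical C planner support function
as $$ select true $$;        -- placeholder body for illustration
```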
## Additional context
I can provide actual table definitions, statistics, and `explain` plans privately if it would help. Unfortunately our schema is massive and I cannot spare the time to come up with a small reproduction example right now.
I think it should be possible to figure out a reproduction based on just the information above. The key details appear to be:
- A join which splits into two queries, with a lateral join on `json_array_elements`
- A large table with an RLS function
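A reproduction could probably be sketched along these lines (untested, all names hypothetical):

```sql
create table big_table (id int8 primary key, data text);
insert into big_table
  select g, md5(g::text) from generate_series(1, 1500000) g;

-- stand-in for an RLS check that rejects nearly every row
create function big_table_rls(id int8) returns boolean
  language sql stable as $$ select id % 100000 = 0 $$;

alter table big_table enable row level security;
create policy p on big_table using (big_table_rls(id));
analyze big_table;

-- then, as a non-owner role, EXPLAIN ANALYZE the V5-style
-- json_array_elements query from above against big_table
```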