Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solve current limitations executing nested subqueries in WHERE clause with correlated conditions referencing columns from grandparents #24500

Open
psantos-denodo opened this issue Feb 5, 2025 · 0 comments

Comments

@psantos-denodo
Copy link

Although Presto is able to execute qureies containing subqueries in the WHERE clause it has some limitations.
In particular, it seems to be limitations using nested subqueries in WHERE clause with correlated conditions referencing columns from grandparents. Examples:

-- works
SELECT * FROM tpch.sf1.orders o
WHERE totalprice  = (SELECT max(totalprice) FROM tpch.sf1.orders o2 
                                 WHERE o2.orderkey = (select max(l.orderkey) from sf1.lineitem l));

-- works with low performance
SELECT * FROM tpch.sf1.orders o
WHERE totalprice  = (SELECT max(totalprice) FROM tpch.sf1.orders o2 
                    WHERE o2.orderkey = 
                          (select max(l.orderkey) from sf1.lineitem l
                           where l.shipdate > o2.orderdate));
                          
-- Correlated condition referencing a column 2-levels above: Fails with error "Given correlated subquery is not supported"
SELECT * FROM tpch.sf1.orders o
WHERE totalprice  = (SELECT max(totalprice) FROM tpch.sf1.orders o2 
                    WHERE o2.orderkey = 
                          (select max(l.orderkey) from sf1.lineitem l
                           where l.shipdate = o.orderdate));

-- Correlated condition referencing a column 2-levels above and a column 1 level above: Fails with error "Unexpected UnresolvedSymbolExpression in logical plan"
SELECT * FROM tpch.sf1.orders o
WHERE totalprice  = (SELECT max(totalprice) FROM tpch.sf1.orders o2 
                    WHERE o2.orderkey = 
                          (select max(l.orderkey) from sf1.lineitem l
                           where l.shipdate = o.orderdate and l.shipdate = o2.orderdate));

Expected Behavior or Use Case

The queries that fail should work

Presto Component, Service, or Connector

The examples are using the tpch connector but It seems to be independent of the connector, it is possible to experience the same with the hive connector.

Context

Tools that rely on Presto to execute certain queries do not have a clear way to identify these not supported scenarios to decide to send the query to Presto or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant