Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move limit in getFilesAndIndicesForDimensions to time dimension #409

Open
mgrunbauer opened this issue Oct 9, 2024 · 0 comments
Open

Move limit in getFilesAndIndicesForDimensions to time dimension #409

mgrunbauer opened this issue Oct 9, 2024 · 0 comments

Comments

@mgrunbauer
Copy link
Collaborator

In #361, a PostgreSQL query in the function getFilesAndIndicesForDimensions was rewritten to this format:

select distinct t1.path,member,dimmember,height,dimheight,time,dimtime from
    t20240301t095153031_xkq9pp0nv3kqmonttvhf t1
inner join t20240301t095153038_hqug9msl1wgxpew4yozf t2
    ON t1.path = t2.path
inner join t20240301t095153043_oe2pyeutb8pdrvb7gu7r t3
    ON t1.path = t3.path
where
    member = 'member4' and
    t1.adaguctilinglevel != -1 and
    abs(5000 - height) = (select min(abs(5000 - height)) from t20240301t095153038_hqug9msl1wgxpew4yozf) and
    time = '2017-01-01T00:15:00Z'
order by member,height,time
limit 512

The original goal of having a limit at all was to prevent Adaguc from returning data for a very large time period. However the limit 512 now limits all the results from the query.

If we now perform an EDR cube call on a custom dimension (Ensemble data with members=* or statistical data with percentile=*), EDR will respond with a limited set.

The result of the query is time steps * custom dimension steps:

  • With 144 timesteps, you can only query 3 percentiles, because 144 * 3 = 432, which fits inside 512.
  • Querying 4 percentiles results in reduced timesteps, because 144 * 4 = 576, which is more than 512.

We should rewrite the query, for example by using a subquery on the time dimension, with the limit, in the inner join. Something like the following (untested):

select distinct t1.path,member,dimmember,height,dimheight,time,dimtime from
    t20240301t095153031_xkq9pp0nv3kqmonttvhf t1
inner join (SELECT * FROM t20240301t095153038_hqug9msl1wgxpew4yozf t2 WHERE ... LIMIT 512)
    ON t1.path = t2.path
inner join t20240301t095153043_oe2pyeutb8pdrvb7gu7r t3
    ON t1.path = t3.path
where
    member = 'member4' and
    t1.adaguctilinglevel != -1 and
    abs(5000 - height) = (select min(abs(5000 - height)) from t20240301t095153038_hqug9msl1wgxpew4yozf) and
    time = '2017-01-01T00:15:00Z'
order by member,height,time

Testing and verifying this functionality is easiest after custom dimension support is added to EDR cube.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant