This idea is courtesy @ahmarsuhail.
Today PhysicalIO expands the read window based on prior sequential read patterns. This is sensible in general, but when reading Parquet, prefetching past the boundary of a row group (RG) or of the footer (for which there should be no prefetching anyway) never makes sense.
Given that LogicalIO already knows the size of each RG, it also knows the upper bound on any useful prefetch. Enforcing that bound would reduce over-reads.
A strawman of the approach: PhysicalIO accepts an optional upper bound on each request, and the Parquet LogicalIO passes it on the relevant fetches.
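
To make the strawman concrete, here is a minimal sketch of the clamping logic in Python. It is not the project's API: `plan_read`, `row_group_upper_bound`, and all parameter names are hypothetical, and the read-ahead amount is passed in as a plain number rather than derived from observed sequential reads.

```python
from typing import Optional, Tuple


def row_group_upper_bound(rg_offset: int, rg_size: int) -> int:
    """Hypothetical helper: exclusive end offset of a row group."""
    return rg_offset + rg_size


def plan_read(pos: int, length: int, readahead: int,
              upper_bound: Optional[int] = None) -> Tuple[int, int]:
    """Return the [start, end) byte range to fetch for one request.

    `readahead` stands in for whatever extra bytes the physical layer
    would normally add after seeing sequential reads (an assumption).
    `upper_bound` is the optional exclusive limit supplied by the
    logical layer, for example the end of the current row group.
    """
    requested_end = pos + length
    end = requested_end + readahead
    if upper_bound is not None:
        # Clamp the prefetch to the bound, but never fetch less than
        # what the caller explicitly asked for.
        end = max(requested_end, min(end, upper_bound))
    return pos, end


# A row group occupying bytes [100, 400) of the object.
bound = row_group_upper_bound(rg_offset=100, rg_size=300)
unbounded = plan_read(pos=100, length=50, readahead=1000)
bounded = plan_read(pos=100, length=50, readahead=1000, upper_bound=bound)
```

Without the bound, the planned range runs well past the row group. With it, the range stops at the row group's end. For a footer read, the logical layer could pass the end of the request itself as the bound, which disables prefetching for that fetch.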