Member

dshean commented Jul 12, 2025

Tested with a large GeoJSON AOI (~18000 km²), which involved 12278 tiles (2 x 2 km each):

  • Greatly speeds up EPT reader preparation for each tile. Still serial, but no longer loops through all 2201 EPT polygons for every tile. The initial read of EPT polygons is now limited to the user-defined AOI polygon; the intersection is then done once, and only valid EPT files are added to the reader.
  • Fixes a bug that allowed only 9999 pipeline JSON files to be created
  • Cleans up import statements
  • Cleans up help strings, description and comments
  • Cleans up print statements
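
The "read once, intersect once" idea in the first bullet could look roughly like the sketch below. This is not the code in this PR: the in-memory GeoDataFrames, the `url` and `tile_id` columns, and all geometries are hypothetical stand-ins, and a real run would need all layers in the same CRS. When reading the EPT index from a file, `geopandas.read_file(path, mask=aoi)` can limit the read to features intersecting the AOI.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-ins for the AOI, the EPT index, and the tile grid.
aoi_geom = box(0, 0, 4, 4)
ept_index = gpd.GeoDataFrame(
    {"url": ["ept_a", "ept_b", "ept_far"]},
    geometry=[box(-1, -1, 3, 5), box(3.5, -1, 5, 5), box(100, 100, 110, 110)],
)
tiles = gpd.GeoDataFrame(
    {"tile_id": [0, 1, 2, 3]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2), box(0, 2, 2, 4), box(2, 2, 4, 4)],
)

# 1. Keep only the EPT polygons that intersect the AOI (done once).
ept_in_aoi = ept_index[ept_index.intersects(aoi_geom)]

# 2. One spatial join pairs every tile with the EPT polygons it intersects,
#    instead of looping over all EPT polygons for each tile.
joined = gpd.sjoin(tiles, ept_in_aoi, how="inner", predicate="intersects")

# 3. Collect the EPT URLs per tile; each list would feed that tile's readers.
urls_per_tile = joined.groupby("tile_id")["url"].apply(sorted).to_dict()
print(urls_per_tile)
```

Tiles with no intersecting EPT polygon simply drop out of the inner join, so no reader is built for them.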

Several related issues require further attention and some additional refactoring.

Member Author

dshean commented Jul 12, 2025

This was branched from the osx-64_support branch, but it could be merged directly into main.
We can close this and open a PR directly against main.

Member Author

dshean commented Jul 12, 2025

Ah, the existing tiling approach uses the bbox, but doesn't check for intersection with the original AOI. This produces A LOT of unnecessary tiles. See the first two tiles (southwest) here (red is AOI, green is EPT index).

[image: first two (southwest) tiles; AOI in red, EPT index in green]

Will push a fix for this. We should consider creating a GeoDataFrame of tile extent polygons up front, rather than creating them iteratively.

Member Author

dshean commented Jul 12, 2025

We should also consider preserving the tile row/col in the pipeline filenames.
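
One possible naming scheme, as a minimal sketch: the function name, the `pipeline` prefix, and the `_r…_c…` layout are all hypothetical, not what the tool currently does. The zero-padding width is derived from the grid size rather than hard-coded, so names sort correctly for any number of rows and columns.

```python
def pipeline_filename(row: int, col: int, n_rows: int, n_cols: int,
                      prefix: str = "pipeline") -> str:
    """Build a pipeline JSON filename that encodes the tile row and column.

    Hypothetical naming scheme; indices are assumed to be zero-based.
    """
    # Width of the largest index, so every name has the same length.
    row_width = len(str(n_rows - 1))
    col_width = len(str(n_cols - 1))
    return f"{prefix}_r{row:0{row_width}d}_c{col:0{col_width}d}.json"


# Example: tile at row 3, column 42 of a 120 x 105 grid.
print(pipeline_filename(3, 42, n_rows=120, n_cols=105))
# prints "pipeline_r003_c042.json"
```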

Member Author

dshean commented Jul 12, 2025

This reduces the number of readers to process from ~11600 to ~6900.

Unfortunately, it means that the tile numbers are inconsistent with the pipeline numbers. Not a showstopper, but not ideal.

I think a better approach here is to create the tile polygon index for the entire bbox, intersect it with the user AOI, and then store all of the relevant metadata for each tile in the GDF, instead of creating and maintaining 4 additional lists for pointcloud_input_crs, readers, extents, and original extents. Just return the GDF (rather than the lists) and expand it elsewhere. Keep the same tile ID.
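
A rough sketch of that approach, under stated assumptions: `build_tile_index`, the column names, and the row-major `tile_id` numbering are hypothetical, and coordinates are assumed to be in a projected CRS with the same units as `tile_size`. The grid is built once for the whole bbox, each tile keeps the ID it has in the full grid, and only tiles intersecting the AOI are returned.

```python
import math

import geopandas as gpd
from shapely.geometry import Polygon, box


def build_tile_index(aoi_geom, tile_size, crs=None):
    """Return a GeoDataFrame of tiles that intersect the AOI.

    tile_id is assigned over the full bbox grid (row-major), so it does not
    change when tiles outside the AOI are dropped.
    """
    minx, miny, maxx, maxy = aoi_geom.bounds
    n_cols = math.ceil((maxx - minx) / tile_size)
    n_rows = math.ceil((maxy - miny) / tile_size)

    records = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = minx + col * tile_size
            y0 = miny + row * tile_size
            records.append({
                "tile_id": row * n_cols + col,
                "row": row,
                "col": col,
                "geometry": box(x0, y0, x0 + tile_size, y0 + tile_size),
            })

    tiles = gpd.GeoDataFrame(records, geometry="geometry", crs=crs)
    # Drop bbox tiles that never touch the AOI; keep the original tile_id.
    return tiles[tiles.intersects(aoi_geom)].reset_index(drop=True)


# Triangular AOI: its bbox holds 9 tiles of size 2, but only 6 intersect it.
aoi = Polygon([(0, 0), (6, 0), (0, 5)])
tiles = build_tile_index(aoi, tile_size=2.0)
print(tiles[["tile_id", "row", "col"]])
```

Per-tile metadata (input CRS, EPT URLs, original extent, pipeline filename) could then be added as further columns on the same GDF, which would keep tile and pipeline numbering in sync.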

Base automatically changed from osx-64_support to main July 14, 2025 21:43