Clean up and test drop rate logic #120

Draft
wants to merge 41 commits into
base: main

e-n-f (Collaborator) commented Jul 17, 2023

No description provided.

e-n-f force-pushed the clean-up-feature-dropping branch from 13be7f7 to c3086ea on July 17, 2023 23:30
e-n-f and others added 25 commits July 17, 2023 16:33
* Current broken behavior

* More blatant test

* Fix hash collision in string pool

* Update version and changelog
* Start to distinguish fixed cluster density setting from as-needed density

* Make consistent {drop,coalesce}-densest decisions between zooms

* Actually track the previous index instead of just intending to

* Clean up collinearities in coalesced features

* To determine densest, look at actual physical distance, not just index

* Don't actually need the previous index in serial_feature now

* Center of mass of one feature to most distant point of the next

* Add apologetic comment

* Wait, how did the tests pass before?

* Revert "Wait, how did the tests pass before?"

This reverts commit f73c8ee.

* Add --maximum-string-attribute-length option

* Update version and changelog

* A little more testing to make sure
* Reviving multi-source-tile overzoom: the clip.cpp side

* Reviving multi-source-tile overzoom: the overzoom.cpp side

* Update readme

* Update version and changelog
* Track output position at the file level instead of within each tile

* Track file position where the child tile data begins

* Add option and document its intended behavior

* Changing the detail loop to account for stopping early

* I forgot I already added an option for this

* Stop early if we can make a complete tile

* Add a test of zoom truncation with limited feature count

* Forgot to commit the actual code change

* Make room for a vertex count in the header of each serialized tile

* Estimate tile complexity; don't try truncating when unlikely to work

* Be more conservative, because ever retrying a tile is a big speed hit

* If stopping early, don't simplify or clean; leave that to overzoom

* Add tiny polygon reduction / dust to overzoom

* Don't try to stop early in the children if we dropped anything by rate

* Fflush here too before pwriting

* Don't stop early if we ended up dropping any features.

Rework the can-the-next-zoom-stop-early logic to avoid going
one zoom further than needed.

* Fix warning

* Fix warnings

* Oops, checking for the wrong expected return value

* Cleanup from adding line simplification in overzoom

* Current (wrong) behavior when combining coalescing and truncating

* Keep a list of parent tiles to skip rather than truncating

* Now the coalesced tiles in z12 get children in z13

* Don't double-count feature dropping when the zoom level is retried

* Correct README description

* Remove todo about special case below basezoom, which is accounted for

* Be a little more aggressive in drop-densest determination

* Scale tile feature limit for megatiles in the same way as byte limit

* Fully deprecate -detect-shared-borders into an alias

* Track the distances found in the douglas-peucker recursion

* Serialize and deserialize the distance with the vertices

* Revert "Serialize and deserialize the distance with the vertices"

This reverts commit 753f1b7.

* Revert "Track the distances found in the douglas-peucker recursion"

This reverts commit e5361f8.

* Revert "Fully deprecate -detect-shared-borders into an alias"

This reverts commit 0698aeb.

* Better tracking of whether we failed to make a full-detail tile

* Put a bloom filter in front of the binary search for shared nodes

* Forgot to take out this printf

* Improve dispatch of tiling tasks

* Still dispatch the biggest tasks first

* Track zoom truncation in the strategies list in the tileset metadata

* Prescan for small deltas before doing proper simplification

* Revert "Prescan for small deltas before doing proper simplification"

This reverts commit d1d8238.

* Update version and changelog

* Rename to --generate-variable-depth-tile-pyramid
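
The "Put a bloom filter in front of the binary search for shared nodes" commit above describes a common pattern: a cheap probabilistic membership check that rejects most misses before the more expensive exact lookup runs. The following is a minimal sketch of that pattern only. `NodeIndex`, `hash1`, `hash2`, and the bit-array size are hypothetical names and values chosen for illustration, not the code in this branch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration: NodeIndex and its hash functions are invented
// for this sketch and are not tippecanoe's actual data structures.
struct NodeIndex {
    std::vector<std::uint64_t> sorted_nodes;  // kept sorted for std::binary_search
    std::vector<bool> bits;                   // Bloom filter bit array

    explicit NodeIndex(std::vector<std::uint64_t> nodes, std::size_t nbits = 65536)
        : sorted_nodes(std::move(nodes)), bits(nbits, false) {
        assert(nbits > 0);  // the modulo below needs a non-empty bit array
        std::sort(sorted_nodes.begin(), sorted_nodes.end());
        for (std::uint64_t n : sorted_nodes) {
            bits[hash1(n) % bits.size()] = true;
            bits[hash2(n) % bits.size()] = true;
        }
    }

    // Two arbitrary mixing functions; any pair of reasonably independent
    // hashes would do for the sketch.
    static std::uint64_t hash1(std::uint64_t x) {
        return (x * 0x9E3779B97F4A7C15ULL) >> 17;
    }
    static std::uint64_t hash2(std::uint64_t x) {
        return ((x ^ (x >> 29)) * 0xBF58476D1CE4E5B9ULL) >> 23;
    }

    bool contains(std::uint64_t n) const {
        // If either bit is clear, the node was never inserted, so the
        // binary search can be skipped entirely.
        if (!bits[hash1(n) % bits.size()] || !bits[hash2(n) % bits.size()]) {
            return false;
        }
        // The filter can give false positives, so the exact search still
        // has the final say.
        return std::binary_search(sorted_nodes.begin(), sorted_nodes.end(), n);
    }
};
```

Because the binary search remains the final arbiter, the filter can only save work; it cannot change the answer.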
#254)

* Fix latitude bboxes for features that extend beyond the mercator plane

* Update some more test expectations

* Update version and changelog
* Clip before dealing with multiplier or filters in overzoom

* Be more careful to retry when the feature count is exceeded

* Adjust the estimated total feature count for the multiplier too

* Fix the feature count estimates, I think

* Pass build info into the version string

* Report the actual max zoom of any tiles as the metadata maxzoom

* Revert unneeded renaming to make the diff more readable

* Clean up the adjustments to tile sizes and feature counts

* Update version and changelog

* Dropping a feature into a multiplier cluster still effectively drops it

* Update changelog

* Rethink the changelog description

* Don't try to truncate zooms if we are still tiling at z18
* Replace build-essential in Docker builder image

build-essential is overkill: it pulls in all of the tools needed to
build Debian packages, which we are not doing. Even though this
is a builder image, it still takes time to download all the extra crud.

Replace build-essential with make, gcc, and g++, which are all that are
needed.

* Replace dev packages in final Docker image

The dev packages are only needed in the builder image. They add header
files and docs and things that are not needed in the final image.

Removing them and replacing them with just the runtime libraries reduces
the final image size by about 25% (160M->124M) in a local build.

Additionally, just for simplicity, zlib1g is already part of base
ubuntu minimal, so we don't even have to list it.

* Remove build-essential mention from the README

Just as in the Dockerfile, people don't need all of those packages
to build directly on their system.
* Factoring out tilestats management from GeoJSON file reading

* Move code around so overzoom can link against parse_layers

* Read the file of bins

* Plumb the bins through to overzoom()

* Some zip code bins to test with

* (Currently non-functional) test of binning

* Starting to spell out the bin matching loop

* Can't flatten points, so don't flatten bins either

* More fleshing out bin traversal

* Bounding box of tile-relative mvt geometry

* Smallest enclosing tile from bbox

* Most of the bin scan

* Add point in polygon check. It crashes.

* Find the matching bins

* GDAL-style bounding boxes have eaten my brain

* Make some features to bin into

* Increment a count as features are found to be within the bins

* Fix longitude wraparound in overzoom bins

* Fix the tests

* Push off attribute copying until after bin assignment

* Carry sum of numeric attributes into the bins

* Also add mean, min, and max

* Add --calculate-feature-index since I keep needing it for testing

* Add an option to accumulate sum/mean/max/min/count of all numeric attrs

* Don't bake in tippecanoe:mean, since we redo it from sum and count

* Forgot to update this test fixture after removing tiled mean

* Update version and changelog
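
For the "Smallest enclosing tile from bbox" step in the commits above, one plausible approach is to walk down from the deepest zoom until both corners of the bounding box fall in the same tile. This is only a sketch under stated assumptions: it assumes coordinates on a 32-bit world grid (values below 2^32), and `Tile` and `smallest_enclosing_tile` are hypothetical names, not the functions added in this branch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types and names, for illustration only.
struct Tile {
    int z;
    std::uint32_t x;
    std::uint32_t y;
};

// Assumes world coordinates in [0, 2^32), stored in 64-bit integers so that
// a shift by 32 (zoom 0) is well defined.
Tile smallest_enclosing_tile(std::uint64_t minx, std::uint64_t miny,
                             std::uint64_t maxx, std::uint64_t maxy,
                             int max_zoom = 32) {
    assert(minx <= maxx && miny <= maxy);
    assert(maxx < (1ULL << 32) && maxy < (1ULL << 32));
    assert(max_zoom >= 0 && max_zoom <= 32);

    for (int z = max_zoom; z > 0; z--) {
        int shift = 32 - z;
        // Both corners land in the same tile at this zoom, so the whole
        // bounding box does too.
        if ((minx >> shift) == (maxx >> shift) && (miny >> shift) == (maxy >> shift)) {
            return Tile{z, static_cast<std::uint32_t>(minx >> shift),
                        static_cast<std::uint32_t>(miny >> shift)};
        }
    }
    // Zoom 0 has a single tile that contains everything.
    return Tile{0, 0, 0};
}
```

A box that straddles the middle of the world grid falls all the way back to tile 0/0/0, however small the box is.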
…on (#260)

* Fix bad interaction between dynamic dropping and limiting by truncation

* Don't enforce the tile size limit here if they said not to
* Another try at fixing longitude wraparound for bins

* Gonna get it right this time

* Forgot to update the comment

* The filter case was not supposed to reinterpret geometry

* Copy antimeridian-crossing geometries to the other side too

* Getting closer to getting antimeridian-crossing polygons right

* Update changelog and version
* Pass bin IDs through to the output

* Update version and changelog
* Plumb bounding boxes through potential intersections

* Quick bbox reject for bins that can't possibly intersect

* Inching toward attribute accumulation in megatile handling

* Some sort of test for how all these things interact with each other.

Automatic numeric attribute accumulation does *not* apply to attributes
that have an explicit attribute accumulator set, because the order of
operations is too messy and weird.

* More sketching

* More sketching

* Actually do some accumulation

* Put all that behind an --accumulate-numeric flag

* Use the same attribute accumulation logic in binning as in megatiles

* Fix backwards conditional

* Add means, but somehow I have some counts of 0

* Handle aggregated attributes with no base attribute in the feature

* Checkpoint before I break everything

* Found a flaw, now to debug

* Fix a typo that broke accumulation

* Add binning tests

* Make sure IDs make it through on the bins

* Fix count/mean accumulation

* Make the numeric accumulation prefix configurable

* Make sure the accumulate test still works with a different prefix

* Forgot to update this test

* More testing to make sure cluster sizes make it all the way through

* Fix neglected --accumulate-attribute when binning

* Mark unexercised attribute accumulation cases as "can't happen"

* Factor out numeric preservation

* Attrs with the accumulation prefix are just preserved, not accumulated

* Test behavior of prefixed attributes

* Plumbing for exclude and exclude-prefix

* Implement and test attribute prefix stripping in overzoom

* Update version and changelog

* For debugging, make an attribute list of source feature IDs

* Revert "For debugging, make an attribute list of source feature IDs"

This reverts commit 65fc99c.
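
The commits above mention a point-in-polygon check and a "quick bbox reject for bins that can't possibly intersect". A rough sketch of how those two tests can combine for the point case: the cheap bounding-box comparison runs first, and the crossing-number (pnpoly-style) test runs only for bins that survive it. `Bin`, `bbox_contains`, and `find_bin` are hypothetical names, and the single-ring, no-holes polygon is a simplifying assumption of the sketch, not a description of the actual binning code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
struct Point { double x, y; };
struct BBox { double minx, miny, maxx, maxy; };

struct Bin {
    BBox bbox;                // precomputed bounds of the ring
    std::vector<Point> ring;  // single outer ring, no holes (assumption)
};

// Cheap rejection test: a point outside the bounds cannot be in the bin.
bool bbox_contains(const BBox &b, double x, double y) {
    assert(b.minx <= b.maxx && b.miny <= b.maxy);  // box must be well formed
    return x >= b.minx && x <= b.maxx && y >= b.miny && y <= b.maxy;
}

// Crossing-number test: count how many ring edges a horizontal ray from
// (x, y) crosses; an odd count means the point is inside.
bool pnpoly(const std::vector<Point> &ring, double x, double y) {
    bool inside = false;
    std::size_t n = ring.size();
    if (n < 3) {
        return false;  // a degenerate ring encloses nothing
    }
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        if (((ring[i].y > y) != (ring[j].y > y)) &&
            (x < (ring[j].x - ring[i].x) * (y - ring[i].y) /
                     (ring[j].y - ring[i].y) + ring[i].x)) {
            inside = !inside;
        }
    }
    return inside;
}

// Returns the index of the first bin containing the point, or -1 if none.
long find_bin(const std::vector<Bin> &bins, double x, double y) {
    for (std::size_t i = 0; i < bins.size(); i++) {
        if (!bbox_contains(bins[i].bbox, x, y)) {
            continue;  // quick reject, skip the per-edge test
        }
        if (pnpoly(bins[i].ring, x, y)) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Points that lie exactly on a bin boundary may be classified either way by the crossing-number test, which is one reason a real implementation might need a fallback for points that match no bin.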
* Fix count accumulation in overzoom

* Add test

* Increment version and changelog
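
Several commits above deal with accumulating sum, mean, min, max, and count for numeric attributes, and one notes that the mean is not stored because it can be recomputed from sum and count. A minimal sketch of that idea follows. `NumericAccumulator` and its member names are hypothetical and do not reflect the attribute names or structures tippecanoe actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical accumulator, for illustration only.
struct NumericAccumulator {
    double sum = 0;
    double min_value = 0;
    double max_value = 0;
    std::size_t count = 0;

    // Fold one feature's attribute value into the running totals.
    void add(double v) {
        if (count == 0) {
            min_value = max_value = v;
        } else {
            min_value = std::min(min_value, v);
            max_value = std::max(max_value, v);
        }
        sum += v;
        count++;
    }

    // Combine two partial results, e.g. when already-aggregated features
    // are aggregated again at a lower zoom.
    void merge(const NumericAccumulator &other) {
        if (other.count == 0) {
            return;
        }
        if (count == 0) {
            *this = other;
            return;
        }
        min_value = std::min(min_value, other.min_value);
        max_value = std::max(max_value, other.max_value);
        sum += other.sum;
        count += other.count;
    }

    // The mean is derived on demand rather than stored, because averaging
    // two stored means would be wrong when their counts differ.
    double mean() const {
        assert(count > 0);
        return sum / static_cast<double>(count);
    }
};
```

Merging sums and counts, and only then dividing, keeps the mean correct across repeated aggregation, which is the reason for not baking the mean into the stored attributes.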
* Bin more aggressively if a point doesn't meet the pnpoly test

* Don't clip the points if we are binning

* Fix output of features added to the bin after its closure

* Update tests

* Update version and changelog
* Remove buggy optimization to avoid reclipping in overzoom

* Add clarifying comment
* Binning by ID

* Add test of binning by ID
…oms (#280)

* Choose the megatile features from those that will be in the next N zooms

* Take fractional zooms into account in multiplier feature choices

* Fix more tests

* Add a flag to retain multiplier features by minimum distance

* Limit feature expansion from multiplier density to 2x

* The multiplier cap was a bad idea

* Revert "The multiplier cap was a bad idea"

This reverts commit 6f8273a.

* Revert "Limit feature expansion from multiplier density to 2x"

This reverts commit a26e413.

* Revert "Add a flag to retain multiplier features by minimum distance"

This reverts commit 01f14a4.

* Remove the multiplier sequence, which should no longer matter

* Revert "Revert "Add a flag to retain multiplier features by minimum distance""

This reverts commit 776da4a.

* Revert "Revert "Limit feature expansion from multiplier density to 2x""

This reverts commit 44a683d.

* Revert "Revert "The multiplier cap was a bad idea""

This reverts commit 80f7cb1.

* Track two kinds of previous index for next_feature

* Fix multiplier density threshold, I think

* Oh, I didn't git add the code changes

* Update version and changelog

* Try to install sqlite3 to fix the automated build

* Deleted too much

* Only let --preserve-point-density-threshold shift density around

* Remove the density debt concept, since it doesn't help

* Make the drop states a vector instead of an array

* Revert "Make the drop states a vector instead of an array"

This reverts commit 66c7abb.

* Revert "Remove the density debt concept, since it doesn't help"

This reverts commit 707bb0c.

* Revert "Only let --preserve-point-density-threshold shift density around"

This reverts commit ecf01f2.
youngpm and others added 12 commits November 1, 2024 12:27
* Add an all-mvt_value attribute accumulation path

* Only bin by ID, not geometrically

* A little cleanup; changelog and version; test

* Remove accidental double-conversion

* Replace duplicated code with template

* Update changelog
* Progress on plumbing a string pool for full_keys through

* More plumbing for key_pool

* Don't keep features with identical locations as multiplier features

* Revert "Don't keep features with identical locations as multiplier features"

This reverts commit 413f0c8.

* Adjust calculated maxzoom to account for duplicate feature locations

* Update changelog and version

* Add a test affected by the maxzoom change with duplicate locations

* Round the drop rate a little for cross-platform test consistency
…performance (#292)

* Add a tippecanoe-decode option to restrict which attributes to decode

* Plumb buffer and feature limit around

* Check the feature limit

* Clarifying cases where output detail can be unspecified

* Clip bins to the tile buffer instead of just passing them through

* Add missing include

* Missed some tests

* Add --no-tile-compression option to tippecanoe-overzoom

* Update version and changelog
* Avoid crash if the first bin gets clipped away

* Add test

* Update version and changelog
* Strip out unwanted attributes earlier in the process

* Skip aggregations whose attributes have been excluded

* Forgot one

* Add some more tests

* Update version and changelog
* Raise tippecanoe-decode tile size limit to 250 MB

* Update version and changelog
* Plumb a clip bounding box around through overzoom

* Actually do some clipping

* Add a test

* Fix post-binning clipping

* Factoring out geometry parsing from feature parsing

* Accept a clip polygon argument to tippecanoe-overzoom

* Progress in the direction of polygon clipping

* Fix the wagyu flags. We need intersection, not union

* Remove debug spew

* Clip points to polygon bounds too

* Copy the geometric binning code to serve as intersection-finding code

* Add clipper2 for linestring clipping

* Compiles, but does not actually seem to clip. Hmm.

* Oh, it helps if I actually call the function

* Add clipping tests

* Add missing fixture, and don't crash if it is missing

* Remember to do polygon clipping after binning too

* Fix scaling before post-binning clipping. Add test.

* Remove unused parts of clipper

* Rename for consistency

* Revert accidentally added line

* Clip the clip regions to the tile bounds to reduce their complexity

* Add a test of clipping the clip region down to the tile boundary

* Update version and changelog
* Make tippecanoe-overzoom accept filters from a file

* Accept clip polygons from a file too

* Add test of clipping by polygon from file

* Add a test of reading an overzoom filter from a file

* Update version and changelog
7 participants