Introducing an option for the user to decide on simplifying GADM shapes #1138
base: main
Great contribution :D I added a comment; please also add a release note.
We are very close, I believe :D
@@ -106,6 +106,7 @@ cluster_options:
 build_shape_options:
   gadm_layer_id: 1 # GADM level area used for the gadm_shapes. Codes are country-dependent but roughly: 0: country, 1: region/county-like, 2: municipality-like
+  simplify_gadm: false # When true, shape polygons are simplified else no
Great @SermishaNarayana :D
What if we make this option numeric? For example, we could rename it to simplify_tolerance, with a default of 0.01 (the current default value); if the value is False or <= 0, no simplification occurs.
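A minimal sketch of the rule proposed above, assuming a hypothetical helper `resolve_simplify_tolerance` (not part of this PR) that maps the config value to either a tolerance or `None`. Treating `True` as "use the default" and `None` as "disabled" are assumptions made here for illustration only.

```python
DEFAULT_TOLERANCE = 0.01  # current default value mentioned in this thread


def resolve_simplify_tolerance(value, default=DEFAULT_TOLERANCE):
    """Return the tolerance to use, or None when simplification is disabled.

    Hypothetical helper for illustration; not part of the PR.
    """
    # bool is a subclass of int, so it must be handled before the numeric check
    if isinstance(value, bool):
        return default if value else None
    # Assumption: a missing value disables simplification
    if value is None:
        return None
    tolerance = float(value)
    # Zero or negative values switch simplification off
    return tolerance if tolerance > 0 else None
```

A caller would then skip the simplification step whenever the helper returns `None`, and pass the returned number as the tolerance otherwise.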
@davide-f Yes, I am planning to add it and it is already in progress. In the meantime, I was trying to understand why the regions_onshore.geojson file explodes in size without simplification.
@SermishaNarayana @davide-f thanks a lot for taking care of that!
I'm afraid that a numerical simplification option may not work well because of how the _simplify_polys() function itself behaves...
@jome1 did a great investigation of the behaviour of _simplify_polys(), which demonstrated that all polygons are simplified independently of each other. That can lead to a number of "enclaves" emerging along the borders of the regions.
I suspect the explosion of the polygons you observed may be related to that: once we call _simplify_polys(), a large number of such enclaves emerge because the geometry is quite complex. The good news is that the issue is likely to be resolved with the next release of shapely, which should contain an improved simplification algorithm.
So, I'd probably keep a boolean flag for now and return to the idea of adding a numeric parameter once an advanced simplification option is available in shapely. What do you think?
If I understand it right, the problem you mention occurs when simplifying the polygons with the _simplify_polys() function in build_shapes. But in the current observation of the regions_onshore.geojson file-size explosion, the issue occurs when GADM simplification is turned off. In this case the code skips _simplify_polys(). Is the issue still related, then?
Please correct me if I understood it wrong.
@SermishaNarayana you are absolutely right, it is _simplify_polys() that creates the issues in #1051.
I had misunderstood: you are observing the file expansion when _simplify_polys() is bypassed. Not sure that's really related to #1051, then. Thank you so much for the explanation!
Just to be sure I get the problem: you are also observing some strange geometry effects, right? If so, could you please post a picture there? It would be great to understand what is going on with the geometries.
@SermishaNarayana can you also share the gadm_shapes and the whole shapes folder? simplify_polys is applied to gadm_shapes first, and with alternative_clustering I expected that to be used.
The output of bus_regions is regions_onshore; other rules then apply and edit them further.
You may have found another bug further down the chain, but it is very likely not linked to this PR.
Sure, I have attached all the files in the shape folder
shape_files.zip
Also, regarding a numerical option for the tolerance, I am planning to raise it as a separate PR. Is that alright? And to clarify: a tolerance value of 0 has the same issues as not allowing the GADM shapes to be simplified.
The gadm_shapes here are fine, so this PR should be ok. The problem is likely later on.
Let's also keep the flag true/false and open an issue for custom tolerances and minarea.
Feel free to mark this PR as ready for review whenever you are ready. Please add a line to the release notes :)
@davide-f Apologies, the files I uploaded earlier were the ones with the simplify_gadm option turned ON. I am attaching another set of files with the simplify_gadm option turned OFF.
shape_files_non_simplified.zip
Can you please verify if these files look alright?
I shall add the release note :)
Great! At first glance it looks good.
Here we have a great comparison and a justification for adding custom tolerances.
The feedback I see is the following:
- the number of shapes in the two datasets is the same: 55 shapes in both. This means that the shapes are conserved.
- the "simplified" one is 0.7 MB, while the non-simplified one is 73 MB. This is a good reason to use simplification. For some applications, however, the default value may be too aggressive and a lower level of simplification may be acceptable.
The shapes are consistent and the bug you found later on in the simplification is not visible here. This suggests that the PR is ready to go functionality-wise :)
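The comparison above (same number of shapes, very different file size) can be reproduced with a short script. This is only a sketch using the standard library; the function names and any file paths passed to them are hypothetical and not part of the PR.

```python
import json
import os


def summarize_geojson(path):
    """Return the feature count and file size in MB for a GeoJSON file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return {
        "n_shapes": len(data.get("features", [])),
        "size_mb": os.path.getsize(path) / 1e6,
    }


def compare_geojson(simplified_path, raw_path):
    """Compare shape counts and file sizes of two GeoJSON files."""
    simplified = summarize_geojson(simplified_path)
    raw = summarize_geojson(raw_path)
    return {
        # True when simplification did not drop or add any shape
        "same_shape_count": simplified["n_shapes"] == raw["n_shapes"],
        # How many times larger the non-simplified file is
        "size_ratio": raw["size_mb"] / simplified["size_mb"],
    }
```

With the figures reported in this thread (0.7 MB versus 73 MB, 55 shapes in both), such a check would report an unchanged shape count and a size ratio of roughly 100.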
… into simplify_gadm
…th into simplify_gadm
Closes # (if applicable).
Changes proposed in this Pull Request
Checklist
- envs/environment.yaml and doc/requirements.txt.
- config.default.yaml and config.tutorial.yaml.
- test/ (note tests are changing the config.tutorial.yaml)
- doc/configtables/*.csv and line references are adjusted in doc/configuration.rst and doc/tutorial.rst.
- doc/release_notes.rst is amended in the format of previous release notes, including reference to the requested PR.