Not testing Geoserver on other PAVICS deployments than the production host #183

Open
tlvu opened this issue Nov 6, 2020 · 13 comments

@tlvu
Contributor

tlvu commented Nov 6, 2020

Notebook https://github.com/Ouranosinc/pavics-sdi/blob/400c0f920b307fffc984b4b97c7e8d12c371b756/docs/source/notebooks/WFS_example.ipynb hardcodes http://boreas.ouranos.ca/geoserver/wfs, so the test suite at https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests cannot replace the hostname to target other PAVICS deployments. As a result, the Geoserver instances on other PAVICS deployments are not tested.

However, targeting other Geoserver instances means we have to provide test data matching the needs of the notebook WFS_example.ipynb. So this means:

  • Find the smallest dataset that can demonstrate the Geoserver feature we want to showcase
  • Ensure this dataset can be legally distributed publicly
  • Provide a mechanism to distribute this dataset and load it into Geoserver without using Geoserver WebUI for test automation
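
To make the hostname problem concrete, here is a minimal sketch of the kind of substitution involved. It assumes the test suite rewrites the default host `pavics.ouranos.ca` (the host named in the commits linked to this issue) into the host under test; the function name `retarget` and the host `staging.example.org` are hypothetical and not taken from PAVICS-e2e-workflow-tests.

```python
import re

# Host that the notebooks are assumed to use by default.
DEFAULT_HOST = "pavics.ouranos.ca"


def retarget(source: str, target_host: str, default_host: str = DEFAULT_HOST) -> str:
    """Replace every occurrence of default_host in source with target_host."""
    # A callable replacement avoids any backslash processing of target_host.
    return re.sub(re.escape(default_host), lambda _match: target_host, source)


# A cell using the default host is redirected to the server under test.
cell_default = "wfs_url = 'https://pavics.ouranos.ca/geoserver/wfs'"
print(retarget(cell_default, "staging.example.org"))

# A cell hardcoding boreas.ouranos.ca does not match, so it still hits production.
cell_hardcoded = "wfs_url = 'http://boreas.ouranos.ca/geoserver/wfs'"
print(retarget(cell_hardcoded, "staging.example.org"))
```

The second call returns the cell unchanged, which is why a notebook that hardcodes `boreas.ouranos.ca` never exercises the Geoserver of the deployment under test.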

FYI @matprov, @Zeitsperre, @huard

@tlvu
Contributor Author

tlvu commented Nov 6, 2020

Related discussion bird-house/birdhouse-deploy#6 (comment)

@huard
Contributor

huard commented Nov 6, 2020

I think there are two distinct issues that we should differentiate:

  1. Testing a new stand-alone installation
  2. Demonstrating PAVICS@ouranos

The notebooks are designed for 2. In this case, it might be better to have a synthetic dataset and associated test for 1 rather than trying to merge 1 and 2 together in a notebook.

@tlvu
Contributor Author

tlvu commented Nov 6, 2020

I think there are two distinct issues that we should differentiate:

1. Testing a new stand-alone installation

2. Demonstrating PAVICS@ouranos

The notebooks are designed for 2. In this case, it might be better to have a synthetic dataset and associated test for 1 rather than trying to merge 1 and 2 together in a notebook.

Agreed. The advantage of mixing both together is that we can test that our demo of PAVICS@ouranos still works, while also saving the time of writing new test cases.

But sometimes it might not be worth it.

I am open to other suggestions on how to test the Geoserver end-to-end.

tlvu added a commit to bird-house/finch that referenced this issue Apr 13, 2021
Use `boreas.ouranos.ca` instead of `pavics.ouranos.ca` for `wfs_url =
'https://boreas.ouranos.ca/geoserver/wfs'` so we do not replace it with
other hosts, due to existing issue
Ouranosinc/pavics-sdi#183.

Write `poly_file = Path('/tmp/mtl_raw.geojson')` to `/tmp/` so the
notebook works when the tutorial-notebooks folder on the Jupyter env is
read-only.
tlvu added a commit to bird-house/finch that referenced this issue Apr 4, 2022
Switching from 'boreas.ouranos.ca' to 'pavics.ouranos.ca' will allow us to
conditionally regex replace 'pavics.ouranos.ca' with the server under test
so we can actually test the Geoserver on that new server.

Everywhere under this pavics-sdi repo, we use 'pavics.ouranos.ca'. These
are the only locations that hardcode 'boreas.ouranos.ca', because
geoserver data are not replicated to standard test servers, so hardcoding
it this way uses the data from the production server.

This patch achieves the same "data from prod" behavior but also allows an
override if we need to actually test the future production server.

Related to Ouranosinc/pavics-sdi#183
tlvu added a commit that referenced this issue Apr 5, 2022
…w-production-server

notebooks: allow to test GeoServer on new production server and fix old .ncml path

Switching from `boreas.ouranos.ca` to `pavics.ouranos.ca` will allow us to
conditionally regex replace `pavics.ouranos.ca` with the server under test
so we can actually test the Geoserver on that new server.

Everywhere under this pavics-sdi repo, we use `pavics.ouranos.ca`. These
are the only locations that hardcode `boreas.ouranos.ca`, because
geoserver data are not replicated to standard test servers, so hardcoding
it this way uses the data from the production server.

This patch achieves the same "data from prod" behavior but also allows an
override if we need to actually test the future production server.

Related to #183 and Ouranosinc/PAVICS-e2e-workflow-tests#104.

Also fix old `nrcan_v2.ncml` path in `regridding.ipynb`.
tlvu added a commit to bird-house/finch that referenced this issue Apr 5, 2022
…w-production-server

subset.ipynb: allow to test GeoServer on new production server

Switching from `boreas.ouranos.ca` to `pavics.ouranos.ca` will allow us to
conditionally regex replace `pavics.ouranos.ca` with the server under test
so we can actually test the Geoserver on that new server.

Everywhere under this finch repo, we use `pavics.ouranos.ca`. These
are the only locations that hardcode `boreas.ouranos.ca`, because
geoserver data are not replicated to standard test servers, so hardcoding
it this way uses the data from the production server.

This patch achieves the same "data from prod" behavior but also allows an
override if we need to actually test the future production server.

Related to Ouranosinc/pavics-sdi#183 and Ouranosinc/PAVICS-e2e-workflow-tests#104.
@tlvu tlvu changed the title from "Not testing Geoserver on other PAVICS deployments than Boreas" to "Not testing Geoserver on other PAVICS deployments than the production host" Sep 20, 2023
@fmigneault
Contributor

@tlvu

However, targeting other Geoserver instances means we have to provide test data matching the needs of the notebook WFS_example.ipynb. So this means

Can you extract the public:canada_admin_boundaries layer referenced in the notebook to add it to bird-house/birdhouse-deploy#381?

This way, we can also validate that everything works when bird-house/birdhouse-deploy#348 is ready as well.

@tlvu
Contributor Author

tlvu commented Sep 25, 2023

Can you extract the public:canada_admin_boundaries layer referenced in the notebook to add it to bird-house/birdhouse-deploy#381?

Asking our in-house Geoserver power user @Zeitsperre: can the request from @fmigneault above be done?

The issues that need to be sorted out are:

  • Ensure this dataset can be legally distributed publicly

  • Provide a mechanism to distribute this dataset and load it into Geoserver without using Geoserver WebUI for test automation

  • How big is this dataset, and where should we host it?

The reason is that all staging and test servers of PAVICS should be able to load this Geoserver data unattended.

@fmigneault
Contributor

Ensure this dataset can be legally distributed publicly

Note that I'm not set on that specific dataset if there is an issue with it. Anything that can be swapped in for the test notebook is fine.
Though, if this one were not allowed, there would be an actual issue, because this layer is already publicly available:
https://pavics.ouranos.ca/geoserver/ows?service=WFS&acceptversions=2.0.0&request=GetFeature&layers=public:canada_admin_boundaries&typeName=public:canada_admin_boundaries&bbox=-74.5,45.2,-73,46
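
For reference, a minimal sketch of how such a GetFeature URL could be assembled for any host under test. The parameter names are copied from the URL above (the redundant `layers` parameter is left out); the function name `build_getfeature_url` is hypothetical, and the optional `outputFormat=application/json` value is an assumption about how the notebook would ask for GeoJSON.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getfeature_url(base_url, type_name, bbox, output_format=None):
    """Build a WFS GetFeature URL restricted to a bounding box."""
    params = {
        "service": "WFS",
        "acceptversions": "2.0.0",
        "request": "GetFeature",
        "typeName": type_name,
        # bbox is (min_lon, min_lat, max_lon, max_lat)
        "bbox": ",".join(str(value) for value in bbox),
    }
    if output_format is not None:
        # Assumption: the server accepts this output format name.
        params["outputFormat"] = output_format
    return f"{base_url}?{urlencode(params)}"


url = build_getfeature_url(
    "https://pavics.ouranos.ca/geoserver/ows",
    "public:canada_admin_boundaries",
    (-74.5, 45.2, -73, 46),
    output_format="application/json",
)
print(url)
```

Only `base_url` changes between deployments, so the same query can be pointed at a test server once the layer exists there.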

  • Provide a mechanism to distribute this dataset and load it into Geoserver without using Geoserver WebUI for test automation
  • How big is this dataset, and where should we host it?

It can be a snapshot for test purposes; it does not need to be updated automatically. It can also be a subset that matches the bbox area of the test notebook if the original is big. This test sample would live in birdhouse-deploy as an optional-component for tests. That component could either place it in the right location in the stack, mounted in GeoServer directly (if that is sufficient/possible?), or do a one-shot docker run to post the features via the GeoServer API.
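
A rough sketch of the "subset that matches the bbox" idea, assuming the snapshot is available as a GeoJSON FeatureCollection. It keeps every feature whose own bounding box overlaps the bbox from the query above, which is a coarse filter rather than an exact geometric intersection, and it does not handle GeometryCollection geometries. All function names are hypothetical.

```python
# Bounding box used in the query above: (min_lon, min_lat, max_lon, max_lat).
NOTEBOOK_BBOX = (-74.5, 45.2, -73.0, 46.0)


def geometry_bounds(geometry):
    """Return (minx, miny, maxx, maxy) of a GeoJSON geometry with coordinates."""

    def positions(coords):
        # A position is a list of numbers; anything else is a nested list.
        if coords and isinstance(coords[0], (int, float)):
            yield coords[0], coords[1]
        else:
            for part in coords:
                yield from positions(part)

    xs, ys = zip(*positions(geometry["coordinates"]))
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """True when two (minx, miny, maxx, maxy) boxes share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def subset_by_bbox(collection, bbox):
    """Keep only the features whose bounding box overlaps bbox."""
    kept = [
        feature
        for feature in collection["features"]
        if feature.get("geometry")
        and boxes_overlap(geometry_bounds(feature["geometry"]), bbox)
    ]
    return {"type": "FeatureCollection", "features": kept}


# Made-up sample data: one polygon inside the bbox, one point far outside it.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "inside"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [
                    [[-74.0, 45.4], [-73.5, 45.4], [-73.5, 45.7], [-74.0, 45.7], [-74.0, 45.4]]
                ],
            },
        },
        {
            "type": "Feature",
            "properties": {"name": "outside"},
            "geometry": {"type": "Point", "coordinates": [-123.1, 49.3]},
        },
    ],
}

subset = subset_by_bbox(sample, NOTEBOOK_BBOX)
print([feature["properties"]["name"] for feature in subset["features"]])
```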

@Zeitsperre
Contributor

Zeitsperre commented Sep 27, 2023

Hi all, please excuse the radio silence, I was getting re-certified for first aid this week.

Can you extract the public:canada_admin_boundaries layer referenced in the notebook to add it to bird-house/birdhouse-deploy#381?

This way, we can also validate that everything works when bird-house/birdhouse-deploy#348 is ready as well.

Absolutely, the dataset can be found via the Canada Census Boundaries geometries. It is publicly available data under the Open Canada License. No legal distribution issues. I'll find a copy and convert it to GeoPackage or GeoJSON (anything but Shapefile).

In order to load the file into GeoServer, this project would likely be one of the better candidates (geoserver-rest). It's much more mature than it was just a few years ago. Once the data is locally available to the server/service, the library has dataset publishing functions.

Will confirm the size and report back.

Edit: The compressed dataset is around 170 MB, so we would need to host it somewhere (or fetch it on deployment? https://www12.statcan.gc.ca/census-recensement/2021/geo/sip-pis/boundary-limites/files-fichiers/lpr_000b21g_e.zip). Interestingly, StatCan offers an ESRI REST service for the layers now: https://geo.statcan.gc.ca/geo_wa/rest/services/2021/Cartographic_boundary_files/MapServer.

@fmigneault
Contributor

The notebook only seems to query WFS in GeoJSON format and display it on a map.
If the original example is 170MB, I think it is better to find a new one and simply update the notebook.
The test layer could be anything much smaller than this.

@Zeitsperre
Contributor

@fmigneault

In that case, literally anything in a geospatial format from here would be fine: https://www.donneesquebec.ca/recherche/dataset. Lots of options and I can add anything we'd like to the production GeoServer.

@fmigneault
Contributor

Another alternative is to POST the data to the Geoserver REST endpoint before querying it in https://github.com/Ouranosinc/pavics-sdi/blob/master/docs/source/notebooks/WFS_example.ipynb
Example notebook doing it: https://app.reviewnb.com/Ouranosinc/PAVICS-e2e-workflow-tests/pull/125/
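
As an illustration only (this is not taken from the linked notebook), provisioning a fresh GeoServer over its REST API could look roughly like the sketch below. It only builds the requests and sends nothing. It assumes the standard GeoServer REST paths for creating a workspace and uploading a zipped shapefile into a datastore; the zipped-shapefile form is used purely because it is the widely documented variant of that endpoint, and whether the same pattern suits GeoPackage or GeoJSON on a given GeoServer version would need checking. Host, workspace, store name and credentials are placeholders.

```python
import base64
import json
from urllib.request import Request


def _basic_auth(user, password):
    """Return an HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def create_workspace_request(base_url, workspace, user, password):
    """Build (but do not send) the request that creates a workspace."""
    body = json.dumps({"workspace": {"name": workspace}}).encode()
    headers = {"Content-Type": "application/json", **_basic_auth(user, password)}
    return Request(f"{base_url}/rest/workspaces", data=body, headers=headers, method="POST")


def upload_shapefile_request(base_url, workspace, store, zip_bytes, user, password):
    """Build (but do not send) the request that uploads a zipped shapefile."""
    url = f"{base_url}/rest/workspaces/{workspace}/datastores/{store}/file.shp"
    headers = {"Content-Type": "application/zip", **_basic_auth(user, password)}
    return Request(url, data=zip_bytes, headers=headers, method="PUT")


# Placeholder values; a real run would read the zip from disk and use real credentials.
base = "https://staging.example.org/geoserver"
ws_req = create_workspace_request(base, "public", "admin", "change-me")
up_req = upload_shapefile_request(base, "public", "test_layer", b"<zip bytes>", "admin", "change-me")

for req in (ws_req, up_req):
    print(req.get_method(), req.full_url)
```

Each request could then be sent with `urllib.request.urlopen(req)` from a test notebook or from a one-shot container, before the WFS query runs.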

@tlvu
Contributor Author

tlvu commented Oct 25, 2023

@fmigneault

In that case, literally anything in a geospatial format from here would be fine: https://www.donneesquebec.ca/recherche/dataset. Lots of options and I can add anything we'd like to the production GeoServer.

Just to be clear, this is not about adding to the production GeoServer. This is about automating the data provisioning in a fresh and empty GeoServer so that all test instances of PAVICS have the matching data for the test notebook to run.

@fmigneault
Contributor

Yes of course. The data is necessary only for the test.

4 participants