Raven notebooks failure after Geoserver upgrade #410

Closed
tlvu opened this issue Oct 21, 2021 · 10 comments

@tlvu
Contributor

tlvu commented Oct 21, 2021

Description

Because our notebooks cannot test Geoserver from a dev machine (issue Ouranosinc/pavics-sdi#183), this error was caught only after PR bird-house/birdhouse-deploy#136 went live on prod.

Steps to Reproduce

http://jenkins.ouranos.ca/job/PAVICS-e2e-workflow-tests/job/master/1277/consoleFull

There are more errors, but most are variations of the following.

03_Extract_geographical_watershed_properties.ipynb

12:49:51  _ raven-master/docs/source/notebooks/03_Extract_geographical_watershed_properties.ipynb::Cell 1 _
12:49:51  Notebook cell execution failed
12:49:51  Cell 1: Cell execution caused an exception
12:49:51  
12:49:51  Input:
12:49:51  select_resp = wps.hydrobasins_select(
12:49:51      location="-71.291660, 50.492758", 
12:49:51      aggregate_upstream=False
12:49:51  )
12:49:51  
12:49:51  # Get GeoJSON polygon of the delineated watershed.
12:49:51  # We can either get links to the files stored on the server, or get the data directly.
12:49:51  
12:49:51  # Get the links
12:49:51  feature_url, upstream_basins_url = select_resp.get(asobj=False)
12:49:51  print("This is the geoJSON file that can be used as the basin contour in other toolboxes:")
12:49:51  print(feature_url)
12:49:51  print("")
12:49:51  
12:49:51  # Get the data directly
12:49:51  feature, upstream_basins = select_resp.get(asobj=True)
12:49:51  
12:49:51  Traceback:
12:49:51  
12:49:51  ---------------------------------------------------------------------------
12:49:51  ProcessFailed                             Traceback (most recent call last)
12:49:51  /tmp/ipykernel_271/3471614810.py in <module>
12:49:51        8 
12:49:51        9 # Get the links
12:49:51  ---> 10 feature_url, upstream_basins_url = select_resp.get(asobj=False)
12:49:51       11 print("This is the geoJSON file that can be used as the basin contour in other toolboxes:")
12:49:51       12 print(feature_url)
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/birdy/client/outputs.py in get(self, asobj)
12:49:51       38         if not self.isSucceded():
12:49:51       39             # TODO: add reason for failure
12:49:51  ---> 40             raise ProcessFailed("Sorry, process failed.")
12:49:51       41         return self._make_output(asobj)
12:49:51       42 
12:49:51  
12:49:51  ProcessFailed: Sorry, process failed.

Traceback (most recent call last):
  File "/opt/conda/envs/wps/lib/python3.7/site-packages/pywps/app/Process.py", line 250, in _run_process
    self.handler(wps_request, wps_response)  # the user must update the wps_response.
  File "/opt/wps/raven/processes/wps_hydrobasins_shape_selection.py", line 88, in _handler
    hybas_request = geoserver.get_hydrobasins_location_wfs(point, domain=domain)
  File "/opt/conda/envs/wps/lib/python3.7/site-packages/ravenpy/utilities/geoserver.py", line 521, in get_hydrobasins_location_wfs
    data = _get_location_wfs(point=coordinates, layer=layer, geoserver=geoserver)
  File "/opt/conda/envs/wps/lib/python3.7/site-packages/ravenpy/utilities/geoserver.py", line 126, in _get_location_wfs
    typename=layer, outputFormat="application/json", method="POST", **kwargs
  File "/opt/conda/envs/wps/lib/python3.7/site-packages/owslib/feature/wfs200.py", line 325, in getfeature
    u = openURL(url, data, method, timeout=self.timeout, headers=self.headers, auth=self.auth)
  File "/opt/conda/envs/wps/lib/python3.7/site-packages/owslib/util.py", line 211, in openURL
    raise ServiceException(req.text)
owslib.util.ServiceException: <?xml version="1.0" encoding="UTF-8"?><ows:ExceptionReport xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0.0" xsi:schemaLocation="http://www.opengis.net/ows/1.1 https://pavics.ouranos.ca/geoserver/schemas/ows/1.1.0/owsAll.xsd">
<ows:Exception exceptionCode="NoApplicableCode">
<ows:ExceptionText>java.lang.RuntimeException: java.io.IOException: Not available: Provinces_États_Global
java.io.IOException: Not available: Provinces_États_Global
Not available: Provinces_États_Global</ows:ExceptionText>
</ows:Exception>
</ows:ExceptionReport>

ERROR: Process._run_process(): Process error: method=wps_hydrobasins_shape_selection.py._handler, line=88, msg=<?xml version="1.0" encoding="UTF-8"?><ows:ExceptionReport xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0.0" xsi:schemaLocation="http://www.opengis.net/ows/1.1 https://pavics.ouranos.ca/geoserver/schemas/ows/1.1.0/owsAll.xsd">
<ows:Exception exceptionCode="NoApplicableCode">
<ows:ExceptionText>java.lang.RuntimeException: java.io.IOException: Not available: Provinces_États_Global
java.io.IOException: Not available: Provinces_États_Global
Not available: Provinces_États_Global</ows:ExceptionText>
</ows:Exception>
</ows:ExceptionReport>
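
The ExceptionReport above is dense, so here is a minimal sketch of how such a report could be reduced to its exception code and innermost message. It uses only the Python standard library; the function name `summarize_ows_exception` and the shortened `SAMPLE_REPORT` string are illustrative assumptions, not part of RavenPy, OWSLib, or PyWPS.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the ExceptionReport shown in the log.
OWS_NS = {"ows": "http://www.opengis.net/ows/1.1"}

# Shortened copy of the report from the log (extra xmlns attributes dropped).
SAMPLE_REPORT = """<?xml version="1.0" encoding="UTF-8"?>
<ows:ExceptionReport xmlns:ows="http://www.opengis.net/ows/1.1" version="2.0.0">
<ows:Exception exceptionCode="NoApplicableCode">
<ows:ExceptionText>java.lang.RuntimeException: java.io.IOException: Not available: Provinces_États_Global
java.io.IOException: Not available: Provinces_États_Global
Not available: Provinces_États_Global</ows:ExceptionText>
</ows:Exception>
</ows:ExceptionReport>"""


def summarize_ows_exception(xml_text):
    """Return a list of (exceptionCode, last non-empty message line) tuples.

    Hypothetical helper for reading logs; not an API of any library.
    """
    # Parse from bytes so the encoding declaration in the XML is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    summary = []
    for exc in root.findall("ows:Exception", OWS_NS):
        code = exc.get("exceptionCode")
        text = exc.findtext("ows:ExceptionText", default="", namespaces=OWS_NS)
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        # The last line of the Java-style message is the innermost cause.
        summary.append((code, lines[-1] if lines else ""))
    return summary


print(summarize_ows_exception(SAMPLE_REPORT))
```

Read this way, the failure points at a layer named Provinces_États_Global being unavailable on the Geoserver side rather than at the hydrobasins request itself.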

05_Extracting_external_data.ipynb

12:49:51  _ raven-master/docs/source/notebooks/05_Extracting_external_data.ipynb::Cell 2 _
12:49:51  Notebook cell execution failed
12:49:51  Cell 2: Cell execution caused an exception
12:49:51  
12:49:51  Input:
12:49:51  # Get the ERA5 data from the Wasabi/Amazon S3 server. Will eventually be replaced by the more efficient direct call with auto-updating timesteps.
12:49:51  # Future code:
12:49:51  '''
12:49:51  catalog_name = 'https://raw.githubusercontent.com/hydrocloudservices/catalogs/main/catalogs/atmosphere.yaml'
12:49:51  cat=intake.open_catalog(catalog_name)
12:49:51  ds=cat.era5_hourly_reanalysis_single_levels_ts.to_dask()
12:49:51  '''
12:49:51  
12:49:51  # For now, let's use this method:
12:49:51  ''' 
12:49:51  Configuration keys. Boilerplate, should not be changed.
12:49:51  '''
12:49:51  CLIENT_KWARGS = {'endpoint_url': 'https://s3.wasabisys.com','region_name': 'us-east-1'}
12:49:51  CONFIG_KWARGS = {'max_pool_connections': 100}
12:49:51  STORAGE_OPTIONS = {'anon': True,'client_kwargs': CLIENT_KWARGS,'config_kwargs': CONFIG_KWARGS}
12:49:51  
12:49:51  '''
12:49:51  Prepare the filesystem and mapper that points to the data itself on the AmazonS3 directory
12:49:51  '''
12:49:51  fsERA5 = fsspec.filesystem('s3', **STORAGE_OPTIONS)
12:49:51  mapper = fsERA5.get_mapper('s3://era5/world/reanalysis/single-levels/zarr-temporal/2021-06-30')
12:49:51  
12:49:51  '''
12:49:51  Get the ERA5 data. We will rechunk it to a single chunck to make it compatible with other codes on the platform, especially bias-correction.
12:49:51  We are also taking the daily min and max temperatures as well as the daily total precipitation.
12:49:51  '''
12:49:51  ERA5_reference=subset.subset_shape(xr.open_zarr(mapper, consolidated=True).sel(time=slice(reference_start_day,reference_stop_day)), basin_contour)
12:49:51  ERA5_tmin=ERA5_reference['t2m'].resample(time='1D').min().chunk(-1,-1,-1)
12:49:51  ERA5_tmax=ERA5_reference['t2m'].resample(time='1D').max().chunk(-1,-1,-1)
12:49:51  ERA5_pr=ERA5_reference['tp'].resample(time='1D').sum().chunk(-1,-1,-1)
12:49:51  
12:49:51  
12:49:51  Traceback:
12:49:51  
12:49:51  ---------------------------------------------------------------------------
12:49:51  CPLE_OpenFailedError                      Traceback (most recent call last)
12:49:51  fiona/_shim.pyx in fiona._shim.gdal_open_vector()
12:49:51  
12:49:51  fiona/_err.pyx in fiona._err.exc_wrap_pointer()
12:49:51  
12:49:51  CPLE_OpenFailedError: input.geojson: No such file or directory
12:49:51  
12:49:51  During handling of the above exception, another exception occurred:
12:49:51  
12:49:51  DriverError                               Traceback (most recent call last)
12:49:51  /tmp/ipykernel_348/3400712141.py in <module>
12:49:51       25 We are also taking the daily min and max temperatures as well as the daily total precipitation.
12:49:51       26 '''
12:49:51  ---> 27 ERA5_reference=subset.subset_shape(xr.open_zarr(mapper, consolidated=True).sel(time=slice(reference_start_day,reference_stop_day)), basin_contour)
12:49:51       28 ERA5_tmin=ERA5_reference['t2m'].resample(time='1D').min().chunk(-1,-1,-1)
12:49:51       29 ERA5_tmax=ERA5_reference['t2m'].resample(time='1D').max().chunk(-1,-1,-1)
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/clisops/core/subset.py in subset_shape(ds, shape, raster_crs, shape_crs, buffer, start_date, end_date, first_level, last_level)
12:49:51      658         poly = shape.copy()
12:49:51      659     else:
12:49:51  --> 660         poly = gpd.GeoDataFrame.from_file(shape)
12:49:51      661 
12:49:51      662     if buffer is not None:
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/geopandas/geodataframe.py in from_file(cls, filename, **kwargs)
12:49:51      501 
12:49:51      502         """
12:49:51  --> 503         return geopandas.io.file._read_file(filename, **kwargs)
12:49:51      504 
12:49:51      505     @classmethod
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/geopandas/io/file.py in _read_file(filename, bbox, mask, rows, **kwargs)
12:49:51      158 
12:49:51      159     with fiona_env():
12:49:51  --> 160         with reader(path_or_bytes, **kwargs) as features:
12:49:51      161 
12:49:51      162             # In a future Fiona release the crs attribute of features will
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/fiona/env.py in wrapper(*args, **kwargs)
12:49:51      398     def wrapper(*args, **kwargs):
12:49:51      399         if local._env:
12:49:51  --> 400             return f(*args, **kwargs)
12:49:51      401         else:
12:49:51      402             if isinstance(args[0], str):
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/fiona/__init__.py in open(fp, mode, driver, schema, crs, encoding, layer, vfs, enabled_drivers, crs_wkt, **kwargs)
12:49:51      255         if mode in ('a', 'r'):
12:49:51      256             c = Collection(path, mode, driver=driver, encoding=encoding,
12:49:51  --> 257                            layer=layer, enabled_drivers=enabled_drivers, **kwargs)
12:49:51      258         elif mode == 'w':
12:49:51      259             if schema:
12:49:51  
12:49:51  /opt/conda/envs/birdy/lib/python3.7/site-packages/fiona/collection.py in __init__(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, ignore_fields, ignore_geometry, **kwargs)
12:49:51      160             if self.mode == 'r':
12:49:51      161                 self.session = Session()
12:49:51  --> 162                 self.session.start(self, **kwargs)
12:49:51      163             elif self.mode in ('a', 'w'):
12:49:51      164                 self.session = WritingSession()
12:49:51  
12:49:51  fiona/ogrext.pyx in fiona.ogrext.Session.start()
12:49:51  
12:49:51  fiona/_shim.pyx in fiona._shim.gdal_open_vector()
12:49:51  
12:49:51  DriverError: input.geojson: No such file or directory

06_Raven_calibration.ipynb

12:49:51  ____ raven-master/docs/source/notebooks/06_Raven_calibration.ipynb::Cell 4 _____
12:49:51  Notebook cell execution failed
12:49:51  Cell 4: Cell execution caused an exception
12:49:51  
12:49:51  Input:
12:49:51  model.rvh.hrus=(GR4JCN.LandHRU(**salmon_land_hru_1),)
12:49:51  
12:49:51  Traceback:
12:49:51  
12:49:51  ---------------------------------------------------------------------------
12:49:51  AttributeError                            Traceback (most recent call last)
12:49:51  /tmp/ipykernel_380/4120598738.py in <module>
12:49:51  ----> 1 model.rvh.hrus=(GR4JCN.LandHRU(**salmon_land_hru_1),)
12:49:51  
12:49:51  AttributeError: 'GR4JCN_OST' object has no attribute 'rvh'

Additional Information

Similar Finch issue bird-house/finch#206

@tlvu changed the title from "Notebooks failure after Geoserver upgrade" to "Raven notebooks failure after Geoserver upgrade" on Oct 21, 2021
@Zeitsperre
Contributor

I think I know what the issue is with the error on 03_Extract_geographical_watershed_properties.ipynb. It looks like the link to that layer is broken. The strange thing is nothing in RavenWPS/RavenPy is asking for that layer. I'm going to remove it and see if that fixes the issue.

@Zeitsperre
Contributor

I was right. I've removed that broken datastore and the cells for notebook 03 are passing now. This may cause problems elsewhere, but we'll deal with them as they arise.

@Zeitsperre
Contributor

Zeitsperre commented Oct 26, 2021

I spoke too soon. Looks like Birdy v0.8.0 isn't converting objects properly right now:
[image]

Birdy has three options for GeoTIFF conversion here, so where's the issue? https://github.com/bird-house/birdy/blob/24dbbd1f1d431b22f918486ab2974019edea4888/birdy/client/converters.py#L252

@huard Do you have any ideas here?

@Zeitsperre
Contributor

For 05_Extracting_external_data.ipynb, the notebook relies on a file that is created by another notebook:

basin_contour = 'input.geojson' # Can be generated using notebook "04_Delineating watersheds"

If we can assume that the notebooks will run sequentially (can we?), we could add a call to write out this input.geojson after we've collected it in notebook_04:
[image]

I personally don't like this approach, though, as it relies on outputs from another notebook that could fail independently. Is there a better method?
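
For reference, a minimal sketch of the write-out step described above, together with an explicit check on the reading side. It assumes the delineated watershed is available as a GeoJSON-like dict; the helper names `write_basin_contour` and `require_basin_contour` are hypothetical and do not exist in the notebooks or in RavenPy.

```python
import json
from pathlib import Path


def write_basin_contour(feature, path="input.geojson"):
    """Write a GeoJSON-like dict to disk so that a later notebook can read it.

    Hypothetical helper: `feature` is assumed to be a plain dict.
    """
    path = Path(path)
    # Wrap a bare Feature so the file always holds a FeatureCollection.
    if feature.get("type") == "Feature":
        feature = {"type": "FeatureCollection", "features": [feature]}
    path.write_text(json.dumps(feature), encoding="utf-8")
    return path


def require_basin_contour(path="input.geojson"):
    """Return the path if the file exists, otherwise fail with a clear message."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found: it must be produced by an earlier notebook "
            "or supplied separately."
        )
    return path
```

The check does not remove the dependency between notebooks; it only replaces the deep fiona DriverError with a message that names the missing prerequisite.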

@Zeitsperre
Contributor

For 06_Raven_calibration.ipynb, this looks oddly familiar and could be an issue that we've already resolved in a new version of RavenPy. @cjauvin, can you confirm?

@cjauvin
Collaborator

cjauvin commented Oct 26, 2021

Sorry I'm not sure I understand what goes on here, as this is not an area of RavenPy I'm very familiar with.

@tlvu
Contributor Author

tlvu commented Oct 26, 2021

> If we can assume that the notebooks will run sequentially (can we?), we could add a call to write out this input.geojson after we've collected it in notebook_04: [image]
>
> I personally don't like this approach, though, as it relies on outputs from another notebook that could fail independently. Is there a better method?

Richard numbered the notebooks, which implies that users will run them sequentially by hand, but I think this assumption does not hold under Jenkins. We have to find an alternative.

@cjauvin
Collaborator

cjauvin commented Feb 15, 2022

About this, which can be found in a stacktrace above:

model.rvh.hrus=(GR4JCN.LandHRU(**salmon_land_hru_1),)

this has been deprecated for a while in RavenPy: the way to reference rvh (and others) is now model.config.rvh.
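
A minimal sketch of the attribute change described above, for code that has to tolerate both layouts. The `get_rvh` helper and the `SimpleNamespace` stand-ins are assumptions made for illustration; they are not RavenPy classes or functions.

```python
from types import SimpleNamespace


def get_rvh(model):
    """Return the model's rvh object from either attribute layout.

    Hypothetical helper: prefers `model.config.rvh` (the layout described
    in this thread) and falls back to the older `model.rvh`.
    """
    config = getattr(model, "config", None)
    if config is not None and hasattr(config, "rvh"):
        return config.rvh
    if hasattr(model, "rvh"):
        return model.rvh
    raise AttributeError(
        f"{type(model).__name__!r} object exposes neither 'config.rvh' nor 'rvh'"
    )


# Stand-ins for illustration only; these are NOT RavenPy model classes.
newer_model = SimpleNamespace(config=SimpleNamespace(rvh=SimpleNamespace(hrus=())))
older_model = SimpleNamespace(rvh=SimpleNamespace(hrus=()))

# The same assignment works against both layouts.
get_rvh(newer_model).hrus = ("hru-placeholder",)
get_rvh(older_model).hrus = ("hru-placeholder",)
```

In the notebook itself, the failing line would presumably become `model.config.rvh.hrus = (GR4JCN.LandHRU(**salmon_land_hru_1),)`, with no helper needed.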

@cjauvin
Collaborator

cjauvin commented Feb 15, 2022

> For 06_Raven_calibration.ipynb, this looks oddly familiar and could be an issue that we've already resolved in a new version of RavenPy. @cjauvin, can you confirm?

And I now realize, many months later, that the answer to this question is thus: yes, that's the case!

@richardarsenault
Contributor

The underlying issue of notebooks requiring data produced by previously run notebooks has been solved in RavenPy by storing the tutorial data on raven-testdata. I think this can be closed now.
