Merge branch 'master' of github.com:bird-house/birdhouse-deploy into cowbird-jupyter-e2e-test
cwcummings committed Nov 30, 2023
2 parents 11f04d4 + e408cea commit 2120b5e
Showing 29 changed files with 450 additions and 18 deletions.
6 changes: 3 additions & 3 deletions .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 1.38.0
current_version = 1.40.0
commit = True
tag = False
tag_name = {new_version}
@@ -30,11 +30,11 @@ search = {current_version}
replace = {new_version}

[bumpversion:file:RELEASE.txt]
search = {current_version} 2023-11-21T16:50:24Z
search = {current_version} 2023-11-30T18:27:41Z
replace = {new_version} {utcnow:%Y-%m-%dT%H:%M:%SZ}

[bumpversion:part:releaseTime]
values = 2023-11-21T16:50:24Z
values = 2023-11-30T18:27:41Z

[bumpversion:file(version):birdhouse/config/canarie-api/docker_configuration.py.template]
search = 'version': '{current_version}'
3 changes: 3 additions & 0 deletions .github/labeler.yml
@@ -61,6 +61,9 @@ component/geoserver:
component/jupyterhub:
- birdhouse/**/jupyterhub/**/*

component/STAC:
- birdhouse/**/*stac*/**/*

feature/WPS:
- birdhouse/**/finch/**/*
- birdhouse/**/flyingpigeon/**/*
79 changes: 79 additions & 0 deletions CHANGES.md
@@ -26,6 +26,85 @@
## Fixes
- Updates incorrect WPS outputs resource name in the cowbird config.

[1.40.0](https://github.com/bird-house/birdhouse-deploy/tree/1.40.0) (2023-11-30)
------------------------------------------------------------------------------------------------------------------

- `optional-components/stac-data-proxy`: add a new feature to allow hosting of local STAC assets.

The new component defines variables `STAC_DATA_PROXY_DIR_PATH` (default `${DATA_PERSIST_ROOT}/stac-data`) and
`STAC_DATA_PROXY_URL_PATH` (default `/data/stac`) that are aliased (mapped) under `nginx` to provide a URL
where locally hosted STAC assets can be downloaded from. This allows a server node to be a proper data provider,
where its STAC-API can return Catalog, Collection and Item definitions that point at these local assets available
through the `STAC_DATA_PROXY_URL_PATH` endpoint.

When enabled, this component can be combined with `optional-components/secure-data-proxy` to allow per-resource
access control of the contents under `STAC_DATA_PROXY_DIR_PATH` by setting relevant Magpie permissions under service
`secure-data-proxy` for children resources that correspond to `STAC_DATA_PROXY_URL_PATH`. Otherwise, the path and
all of its contents are publicly available, in the same fashion that WPS outputs are managed without
`optional-components/secure-data-proxy`. More details are provided under the component's
[README](./birdhouse/optional-components/README.rst#provide-a-proxy-for-local-stac-asset-hosting).
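
As a rough illustration of the aliasing described above, the sketch below maps a file stored under
`STAC_DATA_PROXY_DIR_PATH` to the URL it would be served from under `STAC_DATA_PROXY_URL_PATH`. The server host,
the `/data` value used for `DATA_PERSIST_ROOT`, the sample file and the `asset_href` helper are assumptions made
for the example; they are not part of the component.

```python
from pathlib import PurePosixPath

# Assumed values: DATA_PERSIST_ROOT=/data and a hypothetical server host.
STAC_DATA_PROXY_DIR_PATH = "/data/stac-data"
STAC_DATA_PROXY_URL_PATH = "/data/stac"
SERVER_URL = "https://example.org"


def asset_href(local_path: str) -> str:
    """Return the public URL for a file stored under the proxied directory."""
    # relative_to() raises ValueError when the file is outside the proxied directory.
    relative = PurePosixPath(local_path).relative_to(STAC_DATA_PROXY_DIR_PATH)
    return f"{SERVER_URL}{STAC_DATA_PROXY_URL_PATH}/{relative.as_posix()}"


href = asset_href("/data/stac-data/cmip6/tas_day.nc")
print(href)
```

A STAC Item could then reference the resulting URL as the `href` of one of its assets.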

- `optional-components/stac-public-access`: add public write permission for `POST /stac/search` request.

Since [`pystac_client`](https://github.com/stac-utils/pystac-client), a common interface to interact with STAC API,
uses the `POST` method by default to perform searches, the missing permission caused an unexpected error for users who
are not aware of the specific permission control of Magpie. Nothing is created by that endpoint; the POSTed body
simply uses the convenient JSON format to provide search criteria. It is therefore safe to set this permission
when the STAC service is configured to be publicly searchable.
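
To make the point about `POST /stac/search` concrete, here is a minimal sketch of such a request built with the
standard library only. The server URL and the `CMIP6` collection identifier are hypothetical, and the request is
only constructed, not sent.

```python
import json
from urllib.request import Request

# Hypothetical server URL; '/stac/search' is the endpoint named above.
SEARCH_URL = "https://example.org/stac/search"

# The search criteria travel in the JSON body; nothing is created on the server.
criteria = {"collections": ["CMIP6"], "limit": 10}

request = Request(
    SEARCH_URL,
    data=json.dumps(criteria).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Because the method is `POST`, Magpie evaluates the write permission on the resource even though the operation
only reads data.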

[1.39.2](https://github.com/bird-house/birdhouse-deploy/tree/1.39.2) (2023-11-30)
------------------------------------------------------------------------------------------------------------------

## Changes

- Jupyterhub: periodically check whether the logged-in user still has permission to access jupyterhub

By setting the `JUPYTERHUB_CRYPT_KEY` environment variable in the `env.local` file, jupyterhub will store the user's
authentication information (session cookie) in the database. This allows jupyterhub to periodically check whether the
user still has permission to access jupyterhub (the session cookie has not expired and the permissions have not
changed).

The minimum duration between checks can be set with the `JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE` variable which is an
integer (in seconds).

Note that users who are already logged in to jupyterhub will need to log out and log in for these changes to take
effect.

To forcibly log out all users currently logged in to jupyterhub you can run the following command to force the
recreation of the cookie secret:

```shell
docker exec jupyterhub rm /persist/jupyterhub_cookie_secret && docker restart jupyterhub
```
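
The `JUPYTERHUB_CRYPT_KEY` value mentioned above is described in `default.env` as a 32 byte hex-encoded key. One
possible way to generate such a value is sketched here with Python's standard library; the variable name
`crypt_key` is only for illustration.

```python
import secrets

# 32 random bytes, hex-encoded (64 hexadecimal characters).
crypt_key = secrets.token_hex(32)

# Line that could be added to env.local.
print(f"export JUPYTERHUB_CRYPT_KEY={crypt_key}")
```

The key should be kept secret and stable across restarts, since it is used to encrypt the stored authentication
state.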

[1.39.1](https://github.com/bird-house/birdhouse-deploy/tree/1.39.1) (2023-11-29)
------------------------------------------------------------------------------------------------------------------

## Changes

- Limit usernames in Magpie to match restrictions by Jupyterhub's Dockerspawner

When Jupyterhub spawns a new jupyterlab container, it escapes any character in the username that is not an ASCII
letter or a digit. This results in a username that may not match the expected username (as defined by Magpie). This
mismatch results in the container failing to spawn since expected volumes cannot be mounted to the jupyterlab
container.

This fixes the issue by ensuring that jupyterhub does not convert the username that it receives from Magpie.

Note that this updates the Magpie version.

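
The mismatch described in this entry can be illustrated with a simplified escaping routine. This is not
Dockerspawner's actual implementation; the `escape_username` helper, its escape character and the sample
usernames are assumptions chosen to show how an escaped name can diverge from the one Magpie knows.

```python
import string

SAFE_CHARS = set(string.ascii_letters + string.digits)


def escape_username(name: str, escape_char: str = "_") -> str:
    """Replace every character outside SAFE_CHARS with escape_char + hex of its UTF-8 bytes."""
    escaped = []
    for char in name:
        if char in SAFE_CHARS:
            escaped.append(char)
        else:
            escaped.extend(f"{escape_char}{byte:02X}" for byte in char.encode("utf-8"))
    return "".join(escaped)


for username in ("alice42", "alice.smith"):
    escaped = escape_username(username)
    status = "(mismatch)" if escaped != username else "(unchanged)"
    print(username, "->", escaped, status)
```

A directory created for `alice.smith` would not be found under the escaped name, which is the kind of failure
the username restriction avoids.
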
[1.39.0](https://github.com/bird-house/birdhouse-deploy/tree/1.39.0) (2023-11-27)
------------------------------------------------------------------------------------------------------------------

## Changes

- Add a Magpie Webhook to create the Magpie resources corresponding to the STAC-API path elements when a `STAC-API`
`POST /collections/{collection_id}` or `POST /collections/{collection_id}/items/{item_id}` request is completed.
- When creating the STAC `Item`, the `source` entry in `links` corresponding to a `THREDDS` file on the same instance
is used to define the Magpie `resource_display_name` corresponding to a file to be mapped later on
(e.g. a NetCDF `birdhouse/test-data/tc_Anon[...].nc`).
- Checking that the `source` path is on the same instance is necessary because `STAC` could refer to external assets,
and we do not want to inject Magpie resources that are not part of the active instance where the hook is running.

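
A same-instance check of the kind described above might look like the following sketch. The `is_same_instance`
helper, the host names and the sample links are hypothetical and only show the idea of comparing the host of a
`source` link with the host of the running instance.

```python
from urllib.parse import urlparse


def is_same_instance(source_href: str, instance_host: str) -> bool:
    """Return True when the link points at the host running the hook."""
    return urlparse(source_href).hostname == instance_host.lower()


links = [
    {"rel": "self", "href": "https://example.org/stac/collections/demo/items/item-1"},
    {"rel": "source", "href": "https://example.org/thredds/fileServer/birdhouse/test-data/file.nc"},
    {"rel": "source", "href": "https://other.example.com/thredds/fileServer/data/file.nc"},
]

# Keep only the 'source' links hosted on the local instance.
local_sources = [
    link["href"]
    for link in links
    if link["rel"] == "source" and is_same_instance(link["href"], "example.org")
]
print(local_sources)
```
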
[1.38.0](https://github.com/bird-house/birdhouse-deploy/tree/1.38.0) (2023-11-21)
------------------------------------------------------------------------------------------------------------------
2 changes: 1 addition & 1 deletion Makefile
@@ -1,7 +1,7 @@
# Generic variables
override SHELL := bash
override APP_NAME := birdhouse-deploy
override APP_VERSION := 1.38.0
override APP_VERSION := 1.40.0

# utility to remove comments after value of an option variable
override clean_opt = $(shell echo "$(1)" | $(_SED) -r -e "s/[ '$'\t'']+$$//g")
8 changes: 4 additions & 4 deletions README.rst
@@ -14,13 +14,13 @@ for a full-fledged production platform.
* - releases
- | |latest-version| |commits-since|

.. |commits-since| image:: https://img.shields.io/github/commits-since/bird-house/birdhouse-deploy/1.38.0.svg
.. |commits-since| image:: https://img.shields.io/github/commits-since/bird-house/birdhouse-deploy/1.40.0.svg
:alt: Commits since latest release
:target: https://github.com/bird-house/birdhouse-deploy/compare/1.38.0...master
:target: https://github.com/bird-house/birdhouse-deploy/compare/1.40.0...master

.. |latest-version| image:: https://img.shields.io/badge/tag-1.38.0-blue.svg?style=flat
.. |latest-version| image:: https://img.shields.io/badge/tag-1.40.0-blue.svg?style=flat
:alt: Latest Tag
:target: https://github.com/bird-house/birdhouse-deploy/tree/1.38.0
:target: https://github.com/bird-house/birdhouse-deploy/tree/1.40.0

.. |readthedocs| image:: https://readthedocs.org/projects/birdhouse-deploy/badge/?version=latest
:alt: ReadTheDocs Build Status (latest version)
2 changes: 1 addition & 1 deletion RELEASE.txt
@@ -1 +1 @@
1.38.0 2023-11-21T16:50:24Z
1.40.0 2023-11-30T18:27:41Z
9 changes: 9 additions & 0 deletions birdhouse/components/stac/config/magpie/config.yml.template
@@ -7,6 +7,15 @@ providers:
    c4i: false
    type: api
    sync_type: api
    hooks:
      - type: response
        path: "/stac/collections/?"
        method: POST
        target: /opt/birdhouse/src/magpie/hooks/stac_hooks.py:create_collection_resource
      - type: response
        path: "/stac/collections/[\\w-]+/items/?"
        method: POST
        target: /opt/birdhouse/src/magpie/hooks/stac_hooks.py:create_item_resource

permissions:
# create a default 'stac' resource under 'stac' service
136 changes: 136 additions & 0 deletions birdhouse/components/stac/config/magpie/stac_hooks.py
@@ -0,0 +1,136 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
These hooks will be running within Twitcher, using MagpieAdapter context, applied for STAC requests.
The code below can make use of any package that is installed by Magpie/Twitcher.

.. seealso::
    Documentation about Magpie/Twitcher request/response hooks is available here:
    https://pavics-magpie.readthedocs.io/en/latest/configuration.html#service-hooks
"""

import re
from typing import TYPE_CHECKING, List, Dict

from magpie.api.management.resource import resource_utils as ru
from magpie.api.requests import get_service_matchdict_checked
from magpie.models import Route
from magpie.utils import get_logger
from magpie.db import get_session_from_other
from ziggurat_foundations.models.services.resource import ResourceService

if TYPE_CHECKING:
    from pyramid.response import Response
    from sqlalchemy.orm.session import Session

LOGGER = get_logger("magpie.stac")

def create_collection_resource(response):
    # type: (Response) -> Response
    """
    Create the STAC collection resource.
    """
    request = response.request
    body = request.json
    collection_id = body["id"]
    try:
        display_name = extract_display_name(body["links"])
    except Exception as exc:
        # body.get() avoids a second KeyError in the handler when "links" is missing from the body
        LOGGER.error("Error when extracting display_name from links %s %s", body.get("links"), str(exc), exc_info=exc)
        return response

    # note: matchdict reference of Twitcher owsproxy view is used, just so happens to be same name as Magpie
    service = get_service_matchdict_checked(request)
    # Get a new session from the request, since the session found in the request is already handled by
    # its own transaction manager.
    session = get_session_from_other(request.db)
    try:
        # Create the resource tree
        create_resource_tree(f"stac/collections/{collection_id}", 0, service.resource_id, session, display_name)
        session.commit()

    except Exception as exc:
        LOGGER.error("Unexpected error while creating the collection %s %s", display_name, str(exc), exc_info=exc)
        session.rollback()

    return response

def create_item_resource(response):
    # type: (Response) -> Response
    """
    Create the STAC item resource.
    """
    request = response.request
    body = request.json
    item_id = body["id"]
    try:
        display_name = extract_display_name(body["links"])
    except Exception as exc:
        # body.get() avoids a second KeyError in the handler when "links" is missing from the body
        LOGGER.error("Error when extracting display_name from links %s %s", body.get("links"), str(exc), exc_info=exc)
        return response

    # Get the <collection_id> from url -> /collections/{collection_id}/items
    collection_id = re.search(r'(?<=collections/)[0-9a-zA-Z_.-]+?(?=/items)', request.url).group()

    # note: matchdict reference of Twitcher owsproxy view is used, just so happens to be same name as Magpie
    service = get_service_matchdict_checked(request)
    # Get a new session from the request, since the session found in the request is already handled by
    # its own transaction manager.
    session = get_session_from_other(request.db)
    try:
        # Create the resource tree
        create_resource_tree(f"stac/collections/{collection_id}/items/{item_id}", 0, service.resource_id, session, display_name)
        session.commit()

    except Exception as exc:
        LOGGER.error("Unexpected error while creating the item %s %s", display_name, str(exc), exc_info=exc)
        session.rollback()

    return response

def extract_display_name(links):
    # type: (List[Dict[str, str]]) -> str
    """
    Extract the THREDDS path from the STAC links.
    """
    display_name = None
    for link in links:
        if link["rel"] == "source":
            # Example of title `thredds:birdhouse/CMIP6`
            display_name = link["title"]
            break
    if not display_name:
        raise ValueError("The display name was not extracted properly")

    return display_name

def create_resource_tree(resource_tree, current_depth, parent_id, session, display_name):
    # type: (str, int, int, Session, str) -> None
    """
    Create the resource tree on Magpie.
    """
    tree = resource_tree.split("/")
    # We are past the max depth of the tree.
    if current_depth > len(tree) - 1:
        return

    resource_name = tree[current_depth]
    query = session.query(ResourceService.model).filter(ResourceService.model.resource_name == resource_name, ResourceService.model.parent_id == parent_id)
    resource = query.first()

    if resource is not None:
        # Since the resource exists, we can use its id to create the next resource.
        parent_id = resource.resource_id
        next_depth = current_depth + 1
        create_resource_tree(resource_tree, next_depth, parent_id, session, display_name)

    # The resource wasn't found in the current depth, we need to create it.
    else:
        # Creating the last resource in the tree, we need to use the display_name.
        if current_depth == len(tree) - 1:
            ru.create_resource(resource_name, display_name, Route.resource_type_name, parent_id, db_session=session)
        else:
            # Creating the resource somewhere in the middle of the tree before using its id.
            node = ru.create_resource(resource_name, None, Route.resource_type_name, parent_id, db_session=session)
            parent_id = node.json["resource"]["resource_id"]
            next_depth = current_depth + 1
            create_resource_tree(resource_tree, next_depth, parent_id, session, display_name)
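
The recursion in `create_resource_tree` can be hard to follow without a database at hand. The sketch below models
the same idea with an in-memory dictionary instead of Magpie's session and `ru.create_resource`. The `resources`
store, the `create_tree` helper and the sample identifiers are assumptions made for the example, and it is written
as a loop rather than a recursion.

```python
from itertools import count

# In-memory stand-in for the Magpie resource table: (parent_id, name) -> record.
resources = {}
_ids = count(start=1)


def create_tree(path, parent_id, display_name):
    """Create each missing segment of path; only the leaf gets display_name."""
    segments = path.split("/")
    for depth, name in enumerate(segments):
        key = (parent_id, name)
        if key not in resources:
            is_leaf = depth == len(segments) - 1
            resources[key] = {
                "id": next(_ids),
                "display_name": display_name if is_leaf else None,
            }
        # Existing or newly created, this resource is the parent of the next segment.
        parent_id = resources[key]["id"]
    return parent_id


create_tree("stac/collections/demo", 0, "thredds:birdhouse/demo")
create_tree("stac/collections/demo/items/item-1", 0, "thredds:birdhouse/demo/item-1.nc")
print(len(resources))
```

The second call reuses the three resources created by the first one and only adds `items` and `item-1`, which
mirrors how the hook for an item builds on the tree created for its collection.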
10 changes: 10 additions & 0 deletions birdhouse/components/stac/config/twitcher/docker-compose-extra.yml
@@ -0,0 +1,10 @@
version: "3.4"

services:
  # extend twitcher with MagpieAdapter hooks employed for STAC proxied requests
  twitcher:
    volumes:
      # NOTE: MagpieAdapter hooks are defined within Magpie config, but it is actually Twitcher proxy that runs them
      # target mount location depends on 'MAGPIE_PROVIDERS_CONFIG_PATH' environment variable that is found under `birdhouse/config/twitcher/docker-compose-extra.yml`
      - ./components/stac/config/magpie/config.yml:/opt/birdhouse/src/magpie/config/stac-config.yml:ro
      - ./components/stac/config/magpie/stac_hooks.py:/opt/birdhouse/src/magpie/hooks/stac_hooks.py:ro
@@ -5,6 +5,6 @@ services:
twitcher:
volumes:
# NOTE: MagpieAdapter hooks are defined within Magpie config, but it is actually Twitcher proxy that runs them
# target mount location depends on main docker-compose 'MAGPIE_PROVIDERS_CONFIG_PATH' environment variable
# target mount location depends on 'MAGPIE_PROVIDERS_CONFIG_PATH' environment variable that is found under `birdhouse/config/twitcher/docker-compose-extra.yml`
- ./components/weaver/config/magpie/config.yml:/opt/birdhouse/src/magpie/config/weaver-config.yml:ro
- ./components/weaver/config/magpie/weaver_hooks.py:/opt/birdhouse/src/magpie/hooks/weaver_hooks.py:ro
8 changes: 4 additions & 4 deletions birdhouse/config/canarie-api/docker_configuration.py.template
@@ -109,8 +109,8 @@ SERVICES = {
# NOTE:
# Below version and release time auto-managed by 'make VERSION=x.y.z bump'.
# Do NOT modify it manually. See 'Tagging policy' in 'birdhouse/README.rst'.
'version': '1.38.0',
'releaseTime': '2023-11-21T16:50:24Z',
'version': '1.40.0',
'releaseTime': '2023-11-30T18:27:41Z',
'institution': 'Ouranos',
'researchSubject': 'Climatology',
'supportEmail': '${SUPPORT_EMAIL}',
@@ -142,8 +142,8 @@ PLATFORMS = {
# NOTE:
# Below version and release time auto-managed by 'make VERSION=x.y.z bump'.
# Do NOT modify it manually. See 'Tagging policy' in 'birdhouse/README.rst'.
'version': '1.38.0',
'releaseTime': '2023-11-21T16:50:24Z',
'version': '1.40.0',
'releaseTime': '2023-11-30T18:27:41Z',
'institution': 'Ouranos',
'researchSubject': 'Climatology',
'supportEmail': '${SUPPORT_EMAIL}',
12 changes: 11 additions & 1 deletion birdhouse/config/jupyterhub/default.env
@@ -5,7 +5,7 @@
# are applied and must be added to the list of DELAYED_EVAL.

export JUPYTERHUB_DOCKER=pavics/jupyterhub
export JUPYTERHUB_VERSION=4.0.2-20231002
export JUPYTERHUB_VERSION=4.0.2-20231127

# Jupyter single-user server images, can be overridden in env.local to have a space separated list of multiple images
export DOCKER_NOTEBOOK_IMAGES="pavics/workflow-tests:230601"
@@ -64,6 +64,15 @@ export JUPYTERHUB_CONFIG_OVERRIDE=""
# recommended as it may permit unauthorized users from accessing jupyterhub.
export JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL='http://twitcher:8000/ows/verify/jupyterhub'

# 32 byte hex-encoded key used to encrypt a user's authentication state in the jupyterhub database.
# If set, jupyterhub will periodically check if the user still has permission to access jupyterhub (according to Magpie)
export JUPYTERHUB_CRYPT_KEY=

# Jupyterhub will check if the currently logged-in user still has permission to access jupyterhub (according to Magpie)
# if their authentication information is older than this value (in seconds). This value is only applied if
# JUPYTERHUB_CRYPT_KEY is set.
export JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE=60

export DELAYED_EVAL="
$DELAYED_EVAL
JUPYTERHUB_USER_DATA_DIR
@@ -86,6 +95,7 @@ OPTIONAL_VARS="
\$JUPYTERHUB_DOCKER
\$JUPYTERHUB_VERSION
\$JUPYTERHUB_AUTHENTICATOR_AUTHORIZATION_URL
\$JUPYTERHUB_AUTHENTICATOR_REFRESH_AGE
\$JUPYTER_IDLE_SERVER_CULL_TIMEOUT
\$JUPYTER_IDLE_KERNEL_CULL_TIMEOUT
\$JUPYTER_IDLE_KERNEL_CULL_INTERVAL
1 change: 1 addition & 0 deletions birdhouse/config/jupyterhub/docker-compose-extra.yml
@@ -27,6 +27,7 @@ services:
MOUNT_IMAGE_SPECIFIC_NOTEBOOKS: ${MOUNT_IMAGE_SPECIFIC_NOTEBOOKS}
USER_WORKSPACE_UID: ${USER_WORKSPACE_UID}
USER_WORKSPACE_GID: ${USER_WORKSPACE_GID}
JUPYTERHUB_CRYPT_KEY: ${JUPYTERHUB_CRYPT_KEY}
volumes:
- ./config/jupyterhub/jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py:ro
- ./config/jupyterhub/custom_templates:/custom_templates:ro