Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

❓ [Question]: Can we change PAVICS to Birdhouse everywhere #414

Closed
mishaschwartz opened this issue Dec 14, 2023 · 8 comments · Fixed by #428
Closed

❓ [Question]: Can we change PAVICS to Birdhouse everywhere #414

mishaschwartz opened this issue Dec 14, 2023 · 8 comments · Fixed by #428
Assignees
Labels
question Further information is requested

Comments

@mishaschwartz
Copy link
Collaborator

Questions

As we're writing documentation for Marble (which uses the birdhouse software), it seems strange to describe configuration and deployment instructions using language that refers specifically to PAVICS.

I would really like it if we changed references to "PAVICS" in the documentation, code, etc. to "Birdhouse".

Unfortunately, to do this fully, this will require some major changes for existing deployments. This is because PAVICS is currently in variable names and file names which means that we'll need to update env.local files and CI scripts.

For example:

  • references to PAVICS_FQDN will need to be changed to BIRDHOUSE_FQDN
  • scripts that call pavics-compose.sh will need be changed to call birdhouse-compose.sh

I understand that this is a going to be a big pain to change for existing deployments so I propose we do one of the following:

  1. change PAVICS to Birdhouse everywhere and do the hard work of updating the current deployments where needed
  2. change PAVICS to Birdhouse everywhere and leave in a bunch of redundant environment variables and symlinked files so that existing deployments will work with the old setup. Then we can take our time changing things over and officially drop support for the old variable and file names in version 3.
  3. add a disclaimer to the documentation saying that for legacy reasons, some of the files use the word PAVICS (but don't worry about it).
  4. some other option

Obviously, option 3 would be the easiest to implement but options 1 and 2 would improve the code base (in my opinion).

I'm interested to hear what people think of these options.

@huard @fmigneault @tlvu @eyvorchuk

References

Environment

Information Value
Server/Platform URL
Version Tag/Commit
Related issues/PR
Related components
@mishaschwartz mishaschwartz added the question Further information is requested label Dec 14, 2023
@fmigneault
Copy link
Collaborator

Conceptually, I agree it should be renamed to birdhouse.

I was actually in the process of adjusting/adding some override variables to replace items that were explicitly enforced to Ouranos/PAVICS details for all instances, specifically :

'institution': 'Ouranos',
'researchSubject': 'Climatology',
'supportEmail': '${SUPPORT_EMAIL}',
'category': 'Resource/Cloud Management',
'tags': ['Climatology']
},
'stats': {
'method': '.*',
'route': '(?!)' # this will be set by CANARIE_STATS_ROUTES (see below)
},
'redirect': {
'doc': 'https://pavics-sdi.readthedocs.io/en/latest/arch/backend.html',
'releasenotes': 'https://github.com/bird-house/birdhouse-deploy/blob/master/CHANGES.md',
'support': 'https://github.com/bird-house/birdhouse-deploy/issues',
'source': 'https://github.com/bird-house/birdhouse-deploy',
'tryme': 'https://${PAVICS_FQDN_PUBLIC}',
'licence': 'https://pavics-sdi.readthedocs.io/en/latest/license.html',
'provenance': 'https://pavics-sdi.readthedocs.io/en/latest/provenance/index.html'
},

Preferably, those would be adjusted at the same time to avoid opening the configs multiple times.
I was trying to figure out a way to make these values defaults with "STRONG suggestion/warning" to override them, but if the choice of this discussion is to introduce breaking changes with birdhouse renames, I would prefer to update these simultaneously to required VARS.

I would like to propose a derived approach in between 1-2 regarding the variables.
IMO, we should explicitly replace all PAVICS_ variables to BIRDHOUSE_ in the code.
However, the core default.env would define them like so:

export BIRDHOUSE_FQDN="${PAVICS_FQDN}"

Since (current) pavics-compose.sh checks these required VARS to be non-empty, it does not matter if they are overridden directly or indirectly by PAVICS_... names. There would not be duplicate variables in this code base except these default expansions. Only deployments that explicitly set PAVICS_... in their env.local would see duplicate variables for equivalent concepts.

I am also OK with option 1 (for variables) if everyone agrees. Setting 2-3 extra vars in our env.local is not much extra work.

The biggest issue IMO is the pavics-compose.sh file. Within the code, a BIRDHOUSE_COMPOSE could be provided to reference/override the name as needed and avoid future issues with hard coded pavics-compose.sh references. The actual file name is not that important in that case. The main pain point is external scripts such as the CI pointing directly at pavics-compose.sh. Doing a search-replace of pavics-compose.sh or generating the symlink during cloud-init is equivalent (in terms of work/validation time) for me.

@mishaschwartz
Copy link
Collaborator Author

I was trying to figure out a way to make these values defaults with "STRONG suggestion/warning" to override them, but if the choice of this discussion is to introduce breaking changes with birdhouse renames, I would prefer to update these simultaneously to required VARS.

Do you mean that you'd like to include these changes in #415 ? That's fine with me

I am also OK with option 1 (for variables) if everyone agrees. Setting 2-3 extra vars in our env.local is not much extra work.

I would prefer this since it seems like minimal one time effort to update the env.local file. But, I'm also ok with this strategy if @tlvu @eyvorchuk would prefer:

I would like to propose a derived approach in between 1-2 regarding the variables.
IMO, we should explicitly replace all PAVICS_ variables to BIRDHOUSE_ in the code.

Ok, now about pavics-compose.sh:

The biggest issue IMO is the pavics-compose.sh file. Within the code, a BIRDHOUSE_COMPOSE could be provided to reference/override the name as needed and avoid future issues with hard coded pavics-compose.sh references.

I like this idea I lot.

Another thing we can do as well is to fix and extend the make start and make stop commands with something like make compose (that would then pass on any additional argument to the actual compose file).

That was we can recommend any external scripts in the future to just use the make commands.

Doing a search-replace of pavics-compose.sh or generating the symlink during cloud-init is equivalent (in terms of work/validation time) for me.

I agree, I'm willing to make this update once to external scripts.

@fmigneault
Copy link
Collaborator

Another thing we can do as well is to fix and extend the make start and make stop commands with something like make compose (that would then pass on any additional argument to the actual compose file).

From experience, I have found that passing "additional arguments" via make is sometimes more troublesome or unnatural for inexperienced users. For example, to replicate up -d, you have to do something like:

make COMPOSE_XARGS="up -d" compose
# or 
COMPOSE_XARGS="up -d" make compose

With the target that does:

compose:
    @$(SHELL) $(SCRIPT) $(COMPOSE_XARGS)

For simple uses like up -d, it is acceptable. For more complicated values in COMPOSE_XARGS that could require various quotes, it can become an escape nightmare. I'm not against having it available, though.

@tlvu
Copy link
Collaborator

tlvu commented Dec 18, 2023

I have no problem with renaming variables in env.local since the new names can be easily added ahead of time to env.local and the transitions will be completely seamless.

The problems I see is with external repos that hooks into the platform and that use those variables. Ex: the default.env or docker-compose-extra.yml of those external repos use PAVICS_FQDN for example.

All the config variables in the various default.env and env.local are like the stable interface between the platform and external components.

To ease the transition, I suggest to set all the old names even if they are not used in the platform anymore. So at the end of read-configs.include.sh we can create a new function, something like allow_backward_compat(), and in there we restore export PAVICS_FQDN="$BIRDHOUSE_FQDN" and so on for all the old PAVICS_ ... names.

As for the script pavics-compose.sh, how about creating a symlink pavics-compose.sh -> birdhouse-compose.sh to keep backward-compat with external repos that we have no control over? Creating BIRDHOUSE_COMPOSE var to ease future rename is a good as well. Is this the only script having pavics in the name? How about we avoid using platform name specific in scripts and just call it manager.sh or compose-wrapper.sh or something generic that designate its role only?

PAVICS was the name of a previous funding project (currently DACCS). We learned our lesson and did not use DACCS anywhere internally. Should we also learn the lesson as well and limit using platform names (Birdhouse) in the "public API" (config var and entrypoint scripts) to protect us from future branding rename? It's the same rename effort, might as well choose new names that will prevent the need for future renames.

I agree with @fmigneault that using make is awkward when invoking scripts. However, for workflow that involves calling multiples scripts in a sequence, make can be useful to capture the various steps. So no objection to add make support but it should not be the only way to interact with the platform.

@mishaschwartz
Copy link
Collaborator Author

mishaschwartz commented Dec 19, 2023

To ease the transition, I suggest to set all the old names even if they are not used in the platform anymore.

Yes I think this is a good idea

As for the script pavics-compose.sh, how about creating a symlink pavics-compose.sh -> birdhouse-compose.sh

I'm happy to do this but you'll have to make sure that your scripts can handle a symlinked file as expected. It might be just as much effort to change the references in your scripts.

Is this the only script having pavics in the name?

There is also:

  • PAVICS-deploy.logrotate (this one is used by the scheduler component and should be easy to update)
  • trigger-pavicscrawler (this one is deprecated so I'm not worried about changing it)
  • configure-pavics.sh (this is called by provision.sh but maybe it's called by external scripts as well? ... I don't know because I'm not using vagrant)

Should we also learn the lesson as well and limit using platform names (Birdhouse) in the "public API" (config var and entrypoint scripts) to protect us from future branding rename?

I don't think this in necessary. The name of the software in this repo is "Birdhouse", that is different from a project that uses this software (e.g. PAVICS, DACCS, Marble, etc.). So I'm ok if variables and files in this project refer to "Birdhouse" as long as it doesn't refer to any specific project that uses the software.

So no objection to add make support but it should not be the only way to interact with the platform.

Ok I'm happy with this

@mishaschwartz
Copy link
Collaborator Author

Thank you @tlvu and @fmigneault for your feedback! Based on the discussion above I'm going to propose the following plan. Let me what you think:

  1. Change all PAVICS_ variables to BIRDHOUSE_
  2. Rename all files with PAVICS in the name to birdhouse
  3. recreate pavics-compose.sh that displays a deprecation warning and then calls birdhouse-compose.sh
    • calling pavics-compose.sh will also call the allow_backward_compat() function that redefines PAVICS_ variables for any external component that uses them.
  4. Create a migration guide that describes how to update env.local and any external scripts to support the new naming conventions
  5. make sure that the make start and make stop commands work with the new conventions (but don't add any more make commands for now)

@fmigneault
Copy link
Collaborator

fmigneault commented Dec 19, 2023

For point 3, I suggest reusing the warning strategy introduced in #415. A slightly different messages could be provided when detecting PAVICS_ names.
@tlvu pointed out that other variables such as default passwords could be added in those warnings. I think this is a good idea and we should combine these warning mechanisms as much as possible to avoid duplicating logic.

Sounds good to me for the rest of the proposal.

@mishaschwartz
Copy link
Collaborator Author

we should combine these warning mechanisms as much as possible to avoid duplicating logic.

I agree.

I can wait until #415 is finished and then add another PR using the strategies you introduce there

@mishaschwartz mishaschwartz assigned mishaschwartz and unassigned matprov and tlvu Dec 19, 2023
fmigneault added a commit that referenced this issue Feb 23, 2024
## Overview

Add server details overrides and logging utilities.

## Changes

**Non-breaking changes**

- Compose script utilities:
  * Add `BIRDHOUSE_COLOR` option and various logging/messaging definitions in `birdhouse/scripts/logging.include.sh`.
  * Replace all explicit color "logging" related `echo` in scripts by utility variables
    `MSG_DEBUG`, `MSG_INFO`,  `MSG_WARN` and `MSG_ERROR` as applicable per respective messages.
  * Unify all `birdhouse/scripts` utilities to employ the same `COMPOSE_DIR` variable (auto-resolved or explicitly set)
    in order to include or source any relevant dependencies they might have within the `birdhouse-deploy` repository.
  * Add `info` option (ie: `pavics-compose.sh info`) that will stop processing just before `docker-compose` call.
    This can be used to run a "dry-run" of the command and validate that was is loaded is as expected, by inspecting
    provided log messages.

- Defaults:
  * Add multiple `SERVER_[...]` variables with defaults using previously hard coded values referring to PAVICS.
    These variables use a special combination of `DELAYED_EVAL` and `OPTIONAL_VARS` definitions that can make use
    of a variable formatted as `<ANY_NAME>='${__DEFAULT__<ANY_NAME>}'` that will print a warning messages indicating
    that the default is employed, although *STRONGLY* recommended to be overridden. This allows a middle ground between
    backward-compatible `env.local` while flagging potentially misused configurations.

- Canarie-API: updated references
  * Use the new `SERVER_[...]` variables.
  * Replace the LICENSE URL of the server node pointing
    at [Ouranosinc/pavics-sdi](https://github.com/Ouranosinc/pavics-sdi) instead
    of intended [bird-house/birdhouse-deploy](https://github.com/bird-house/birdhouse-deploy).


**Breaking changes**
- n/a

## Related Issue / Discussion

- Partially related to #414
mishaschwartz added a commit that referenced this issue Jun 4, 2024
## Overview
For historical reasons the name PAVICS was used in variable names,
constants and filenames in this repo to refer
to the software stack in general. This was because, for a long time, the
PAVICS deployment of this stack was the
only one that was being used in production. However, now that multiple
deployments of this software exist in
production (that are not named PAVICS), we remove unnecessary references
to PAVICS in order to reduce confusion
for maintainers and developers who may not be aware of the historical
reasons for the PAVICS name.

  This update makes the following changes:

* The string ``PAVICS`` in environment variables, constant values, and
file names have been changed to
    ``BIRDHOUSE`` (case has been preserved where possible).
    * For example:
      * ``PAVICS_FQDN`` -> ``BIRDHOUSE_FQDN``
      * ``pavics_compose.sh`` -> ``birdhouse_compose.sh``
* ``THREDDS_DATASET_LOCATION_ON_CONTAINER='/pavics-ncml'`` ->
``THREDDS_DATASET_LOCATION_ON_CONTAINER='/birdhouse-ncml'``
* Comment strings and documentation that refers to the software stack as
``PAVICS`` have been changed to use
    ``Birdhouse``.
* Recreated the ``pavics-compose.sh`` script that runs
``birdhouse-compose.sh`` in backwards compatible mode.
* Backwards compatible mode means that variables in ``env.local`` that
contain the string ``PAVICS`` will be used
to set the equivalent variable that contains ``BIRDHOUSE``. For example,
the ``PAVICS_FQDN`` variable set in
the ``env.local`` file will be used to set the value of
``BIRDHOUSE_FQDN``.
* Create a new entrypoint in ``bin/birdhouse`` that can be used to
invoke ``pavics-compose.sh`` or
``birdhouse-compose.sh`` from one convenient location. This script also
includes some useful options and provides
a generic entrypoint to the stack that can be extended in the future.
  * Removed unused variables:
    * `CMIP5_THREDDS_ROOT`

### Migration Guide

- Update ``env.local`` file to replace all variables that contain
``PAVICS`` with ``BIRDHOUSE``.
Variable names have also been updated to ensure that they start with the
prefix ``BIRDHOUSE_``.
    * see `env.local.example` to see new variable names
* see the ``BACKWARDS_COMPATIBLE_VARIABLES`` variable (defined in
`default.env`) for a
      full list of changed environment variable names.
- Update any external scripts that access the old variable names
directly to use the updated variable names.
- Update any external scripts that access any of the following files to
use the new file name:

    | old file name           | new file name              |
    |-------------------------|----------------------------|
    | pavics-compose.sh       | birdhouse-compose.sh       |
    | PAVICS-deploy.logrotate | birdhouse-deploy.logrotate |
    | configure-pavics.sh     | configure-birdhouse.sh     |
    | trigger-pavicscrawler   | trigger-birdhousecrawler   |

- The following default values have changed. If your deployment was
using the old default value, update your
    ``env.local`` file to explicitly set the old default values.

| old variable name | new variable name | old default value | new
default value |

|--------------------------------------------|--------------------------------------|---------------------------------|:------------------------------------------------------|
| POSTGRES_PAVICS_USERNAME | BIRDHOUSE_POSTGRES_USERNAME |
postgres-pavics | postgres-birdhouse |
| THREDDS_DATASET_LOCATION_ON_CONTAINER | (no change) | /pavics-ncml |
/birdhouse-datasets |
| THREDDS_SERVICE_DATA_LOCATION_ON_CONTAINER | (no change) |
/pavics-data | /birdhouse-service-data |
| THREDDS_DATASET_LOCATION_ON_HOST | (no change) |
'${DATA_PERSIST_ROOT}/ncml' |
'${BIRDHOUSE_DATA_PERSIST_ROOT}/thredds-datasets' |
| THREDDS_SERVICE_DATA_LOCATION_ON_HOST | (no change) |
'${DATA_PERSIST_ROOT}/datasets' |
'${BIRDHOUSE_DATA_PERSIST_ROOT}/thredds-service-data' |
| (hardcoded) | BIRDHOUSE_POSTGRES_DB | pavics | birdhouse |
| PAVICS_LOG_DIR | BIRDHOUSE_LOG_DIR | /var/log/PAVICS |
/var/log/birdhouse |
| (hardcoded) | GRAFANA_DEFAULT_PROVIDER_FOLDER | Local-PAVICS |
Local-Birdhouse |
| (hardcoded) | GRAFANA_DEFAULT_PROVIDER_FOLDER_UUID | local-pavics |
local-birdhouse |
| (hardcoded) | GRAFANA_PROMETHEUS_DATASOURCE_UUID |
local_pavics_prometheus | local_birdhouse_prometheus |

- Update any jupyter notebooks that make use of the `PAVICS_HOST_URL`
environment variable to use the new
    `BIRDHOUSE_HOST_URL` instead.
- Set the ``BIRDHOUSE_POSTGRES_DB`` variable to ``pavics`` in the
``env.local`` file. This value was previously
hardcoded to the string ``pavics`` so to maintain backwards
compatibility with any existing databases this should be
kept the same. If you do want to update to the new database name, you
will need to rename the existing database.
For example, the following will update the existing database named
``pavics`` to ``birdhouse`` (assuming the old
    default values for the postgres username):

    ```shell
docker exec -it postgres psql -U postgres-pavics -d postgres -c 'ALTER
DATABASE pavics RENAME TO birdhouse'
    ```

You can then update the ``env.local`` file to the new variable name and
restart the stack
- Set the ``BIRDHOUSE_POSTGRES_USER`` variable to ``postgres-pavics`` in
the ``env.local`` file if you would like to
preserve the old default value. If you would like to change the value of
``BIRDHOUSE_POSTGRES_USER`` then also
update the name for any running postgres instances. For example, the
following will update the user named
    ``postgres-pavics`` to ``postgres-birdhouse``:

    ```shell
docker exec -it postgres psql -U postgres-pavics -d postgres -c 'CREATE
USER "tmpsuperuser" WITH SUPERUSER'
docker exec -it postgres psql -U tmpsuperuser -d postgres -c 'ALTER ROLE
"postgres-pavics" RENAME TO "postgres-birdhouse"'
docker exec -it postgres psql -U tmpsuperuser -d postgres -c 'ALTER ROLE
"postgres-birdhouse" WITH PASSWORD '\''postgres-qwerty'\'
docker exec -it postgres psql -U postgres-birdhouse -d postgres -c 'DROP
ROLE "tmpsuperuser"'
    ```

Note that the ``postgres-qwerty`` value is meant just for illustration,
you should replace this with the value of
    the ``BIRDHOUSE_POSTGRES_PASSWORD`` variable.
Note that you'll need to do the same for the ``stac-db`` service as well
(assuming that you weren't previously
    overriding the ``STAC_POSTGRES_USER`` with a custom value).
## Changes

**Non-breaking changes**
- comment changes
- most changes are backwards compatible (but not all)

**Breaking changes**
- variable, file, constant name changes

## Related Issue / Discussion

- Resolves #414

## Additional Information

Links to other issues or sources.


- [x] Options/flags that modify how the compose script behaves should
all use BIRDHOUSE_ as prefix
- [x] manually test upgrading an instance using the migration guide
(above)
- [x] manually test autodeploy (note: we need to automate this somehow)
- [ ] update associated projects:
    - [ ] https://github.com/DACCS-Climate/marble_client_python
    - [ ] https://github.com/bird-house/pavics-jupyter-base
- [ ] https://github.com/ouranosinc/pavics-datacatalog (deprecated, low
priority)

birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants