-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make bind-mount locations more flexible #356
Conversation
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1823/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-88.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
CHANGES.md
Outdated
- `LOGROTATE_DATA_DIR` (default: `${DATA_PERSIST_ROOT}/logrotate`) | ||
- `MONGODB_DATA_DIR` (default: `${DATA_PERSIST_ROOT}/mongodb_persist`) | ||
- `POSTGRES_DATA_DIR` (default `${DATA_PERSIST_ROOT}/frontend_persist`) | ||
- `USER_WORKSPACES` (default `${DATA_PERSIST_ROOT}/user_workspaces`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: maybe this can be named USER_WORKSPACES_DIR
to be consistent with other *_DIR
names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this variable is already in use so I kept it this way for backwards compatibility
@ChaamC do you have time to chat about whether this change will affect your work with cowbird? |
@mishaschwartz I think it would be better not to have those variables ( Also, using a customizable wps outputs directory could also break weaver. As for the other variables concerned by this PR, I haven't worked with them, but I don't think they will have an impact on Cowbird right now. From my point of view, there's just the problem of customizing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except the USER_WORKSPACES
, all others should be ok to allow overrides.
@@ -36,7 +36,7 @@ services: | |||
- ./components/cowbird/config/cowbird/cowbird.ini:/opt/local/src/cowbird/config/cowbird.ini | |||
# even if not running tasks here, they must be registered to send them off to the right place! | |||
- ./components/cowbird/config/cowbird/celeryconfig.py:/opt/local/src/cowbird/config/celeryconfig.py | |||
- "${DATA_PERSIST_ROOT}/${USER_WORKSPACES}:/${USER_WORKSPACES}" | |||
- "${USER_WORKSPACES}:/user_workspaces" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned by @ChaamC, this could easily break Cowbird.
Something tricky about these volumes is that Cowbird creates dynamic hardlinks between different locations that must resolve properly once mounted. However, even though Cowbird sees the mounted location /user_workspaces
inside its docker container, the hardlinks it must create must resolve properly on the host side, such that once this link is mounted elsewhere such as in Jupyter user-specific dockers, the file still resolves correctly with a reduced/nested directory of the specific user-workspace.
@ChaamC and I have had a lot of discussions and did many exploration tests to try to resolve this already complicated problem. I would prefer not to introduce customizable paths here, to avoid Cowbird having to deal with complicated inverse-volume-mapping resolution to create the links correctly.
Something that could be considered is to have some kind of DATA_PERSIST_SHARED_DIR
(instead of DATA_PERSIST_ROOT
) where stuff like user-workspaces and wps-outputs that must share locations between services are all nested under. However, the ${DATA_PERSIST_SHARED_DIR}/user_workspaces
and ${DATA_PERSIST_SHARED_DIR}/wps_outputs
would be hardcoded to ensure they work properly.
@@ -96,7 +96,13 @@ export WEAVER_WPS_PROVIDERS_RETRY_COUNT=5 | |||
# control interval time between retries (duration in seconds, counts toward maximum timeout) | |||
export WEAVER_WPS_PROVIDERS_RETRY_AFTER=5 | |||
|
|||
export WEAVER_MONGODB_DATA_DIR='${DATA_PERSIST_ROOT}/mongodb_weaver_persist' | |||
|
|||
export WEAVER_WPS_PRIVATE_DIR='${DATA_PERSIST_ROOT}/wps_private' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one might need to be revisited.
Might be deprecated considering the switch to wpsoutputs
with nested user-context managed by Cowbird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fmigneault are you suggesting that this shouldn't be included in this PR or just that this might be deprecated some time in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think it will be deprecated eventually.
@ChaamC @fmigneault Thank you both for your explanations of the relevant issue. This is what I understand from your explanation:
As long as that relationship between those directories is maintained, nothing else should be an issue, is that correct? I propose we create a |
@mishaschwartz |
@ChaamC I've made the changes suggested (I ended up naming the variable Please let me know if the solution is compatible with the cowbird changes or if I've missed anything |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1874/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-69.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1875/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-20.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1876/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-90.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job!
Minor comments, but looks good on my side. I tried it with my upcoming Cowbird changes and was able to make it work without trouble. FYI, I have just opened a PR with my upcoming changes #360. I will eventually update it with your changes, as I cannot merge it for the moment anyway.
It would be good to get @fmigneault's approval too, especially for the weaver part that he is more familiar with. Unfortunately he is in vacation, I think for 2 weeks.
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/1879/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-35.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
Should now include all changes from #360 |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2095/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-35.rdext.crim.ca Infrastructure deployment failed. Instance has not been destroyed. @matprov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not recall why, but WEAVER_WPS_OUTPUTS_DIR
(which depends on WPS_OUTPUTS_DIR
, DATA_PERSIST_ROOT
and now the alternative DATA_PERSIST_SHARED_ROOT
) is defined as DELAYED_EVAL
variable.
Similarly, COWBIRD_MONGODB_DATA_DIR
is in DELAYED_EVAL
.
Does that mean that all those variable dependencies should also be marked as DELAYED_EVAL
?
Since the integration of Cowbird to manage private user workspace vs public ones under a common (shared) data persistence dir, and Weaver using the same approach for WPS outputs using the Magpie hook + X-WPS-Output-Context
header, I suggest getting rid of WEAVER_WPS_PRIVATE_DIR
altogether.
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2101/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https:// Infrastructure deployment failed. Instance has not been destroyed. @matprov |
Yes, and they all should be already. If I've missed one let me know.
Ok, done |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2102/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https:// Infrastructure deployment failed. Instance has not been destroyed. @matprov |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2103/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-149.rdext.crim.ca PAVICS-e2e-workflow-tests Pipeline ResultsTests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1334/NOTEBOOK TEST RESULTS |
E2E Test ResultsDACCS-iac Pipeline ResultsBuild URL : http://daccs-jenkins.crim.ca:80/job/DACCS-iac-birdhouse/2107/Result : failure BIRDHOUSE_DEPLOY_BRANCH : flexible-bind-mount-locations DACCS_CONFIGS_BRANCH : master PAVICS_E2E_WORKFLOW_TESTS_BRANCH : master PAVICS_SDI_BRANCH : master DESTROY_INFRA_ON_EXIT : true PAVICS_HOST : https://host-140-154.rdext.crim.ca PAVICS-e2e-workflow-tests Pipeline ResultsTests URL : http://daccs-jenkins.crim.ca:80/job/PAVICS-e2e-workflow-tests/job/master/1338/NOTEBOOK TEST RESULTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late review, I wasn't tagged as reviewer on this PR initially.
Spotted a wrong volume-mount path.
export DELAYED_EVAL=" | ||
$DELAYED_EVAL | ||
WEAVER_WPS_OUTPUTS_DIR | ||
WEAVER_MONGODB_DATA_DIR | ||
WEAVER_WPS_PRIVATE_DATA_DIR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left-over not deleted WEAVER_WPS_PRIVATE_DATA_DIR
?
@@ -97,7 +96,7 @@ services: | |||
networks: | |||
- weaver-mongodb | |||
volumes: | |||
- ${DATA_PERSIST_ROOT}/mongodb_weaver_persist:/data/db | |||
- ${WEAVER_MONGODB_DATA_DIR}/mongodb_weaver_persist:/data/db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be ${WEAVER_MONGODB_DATA_DIR}:/data/db
instead?
## Overview Clean up unused variables and correct file paths from the changes made in 1.33.2 ## Changes **Non-breaking changes** Minor bug fix and code cleanup **Breaking changes** None ## Related Issue / Discussion From suggested changes: #356 (review) ## Additional Information <!-- The test suite can be run using a different DACCS config with ``birdhouse_daccs_configs_branch: branch_name`` in the PR description. To globally skip the test suite regardless of the commit message use ``birdhouse_skip_ci: true`` in the PR description. --> birdhouse_daccs_configs_branch: master birdhouse_skip_ci: false
Overview
Previously, most bind mount locations on the host machine were subdirectories of the folder specified by the
DATA_PERSIST_ROOT
environment variable (/data
by default). This change allows the user to set custom locations for the following additional variables, so that they don't need to be all under the same common directory.LOGROTATE_DATA_DIR
(default:${DATA_PERSIST_ROOT}/logrotate
)MONGODB_DATA_DIR
(default:${DATA_PERSIST_ROOT}/mongodb_persist
)COWBIRD_MONGODB_DATA_DIR
(default:${DATA_PERSIST_ROOT}/mongodb_cowbird_persist
)POSTGRES_DATA_DIR
(default${DATA_PERSIST_ROOT}/frontend_persist
)WEAVER_MONGODB_DATA_DIR
(default${DATA_PERSIST_ROOT}/mongodb_weaver_persist
)The following variable is also added which is another location on disk where files that may contain links are placed. Because the links need to be mounted together in order to resolve properly, the subdirectories of this directory are not configurable:
DATA_PERSIST_SHARED_ROOT
(default: same asDATA_PERSIST_ROOT
)The following variables now create subdirectories under
DATA_PERSIST_SHARED_ROOT
(previously they were created underDATA_PERSIST_ROOT
by default):USER_WORKSPACES
(defaultuser_workspaces
)WEAVER_WPS_OUTPUTS_DIR
(defaultwps_outputs/weaver
)Changes
Non-breaking changes
Breaking changes
None
Related Issue / Discussion
Additional Information
Links to other issues or sources.