Year 1 Development Phases #9

kabilar · 2024-06-11T16:03:11Z

Phase 1
- Prototype deployment with Heroku
- Prototype deployment on AWS EC2
Phase 2 - Production
- Configure CircleCI deployment (i.e. Docker image builds)
- Configure SSL certificate
- Configure front-end so that AWS Access Key ID and Secret Access Key are not required by the user
- Update lincbrain.org front-end to include a copy button for S3 URI of an asset
- Streamline authentication workflow
  - Create Webknossos object when a LINC API user is APPROVED
  - Retroactively update existing Webknossos users
  - For users who are not logged in, update webknossos.lincbrain.org to provide instructions to log into lincbrain.org.
  - Remove Webknossos login fields
  - Add webknossos.lincbrain.org link to the homepage top navigation bar
  - Add fetch mechanism for cookie in the file browser and homepage
  - Fix cookie race condition
- Automated pushes of Docker images to DockerHub upon pull request merges with CircleCI
  - Currently this is set up to push the master and dev branches to DockerHub. (Can potentially clean up dev tags in the future if necessary.)
- Configure backups
  - Determine components to back up: FossilDB, Postgres, /binaryData, /persistent
  - Determine backup size
  - Determine frequency
    - Currently backups are daily and moved to Glacier immediately (source code)
    - If backup sizes grow exponentially, we can switch to rolling 7-day backups and, after 7 days, store weekly backups in S3 Glacier Deep Archive.
  - Determine how to load backup
  - Validate backup restore with fresh setup
  - Document backup and restore processes
- Determine maintenance strategy
  - Updates to the container will be manual during announced downtimes.
  - Data migration with new versions will occur by following the Webknossos SQL scripts.
  - Rollbacks (if necessary) will be performed by standing up a new instance and moving the URL over to it.
- Determine and optimize scalability and costs
  - Test with 2-3 concurrent users with high-resolution histology Zarr image (7 resolution levels, ~0.4 micron)
  - Increase RAM and CPU resources
    - ~~Deploy on r5.2xlarge~~
    - Deploy on r5.4xlarge
  - Increase Java Heap Size
  - Increase EBS volume
  - Configure API to work asynchronously when retrieving Zarr chunks from the AWS S3 bucket
  - Turn off instance using AWS Lambda from 8pm - 5am EST
    - Fix FossilDB version for reboots
- Update LINC Docs
  - Add instructions for creating a Webknossos dataset (from the Google Doc)
  - Add hyperlinks to the deployment and development docs.
- View multiple segmentations simultaneously
  - Segment IDs allow for viewing multiple segmentations in a single layer, but the segmentations cannot overlap. For further details, see "Question: Can multiple segment IDs be assigned to the same voxel?" (scalableminds/webknossos#8032).
  - It is not currently possible to view multiple segmentation layers simultaneously. For further details, see "[Follow up]: Multiple segmentation layers" (scalableminds/webknossos#5695).
  - Annotations overlap for retrospective data. Users can visualize this data in Webknossos as dataset layers.
  - Annotations will not overlap for prospective data. Users will annotate a single volume/segmentation layer with multiple segment IDs.
- Layers disappear at higher resolution levels (4-4-1, 2-2-1, 1-1-1)
- Disable the Upload Datasets tab
- Disable the Upload Annotation(s) button
- Map the LINC Data Platform datasets to Webknossos datasets
  - Datasets are manually added to Webknossos as described in the LINC docs.
  - On the LINC Data Platform, users can navigate to a Webknossos dataset or annotation by selecting the green Webknossos drop-down menu next to an asset.
  - Versioning of annotations
    - Multiple users will annotate a given dataset. The naming scheme is being discussed in "Determine BIDS-friendly naming scheme for Webknossos annotations" (#20).
    - For a given dataset and annotator, multiple versions do not need to be saved.
    - In Webknossos it is possible to change the underlying dataset layer, and thereby the annotations could be considered stale. After discussion with the team, for the histology images it is not likely that the underlying dataset layers will be updated so this case does not need to be handled.
- Update permissions
  - Every user can read every dataset and every annotation. Each user is part of the Default team and is set up with Dataset Manager permissions upon account creation.
  - Change all users from Admin to Dataset Managers
Future work
- Automate download of volume annotation layers, conversion of wkw to zarr with wkcuber, and upload to lincbrain.org
  - This issue is now tracked in "Export Webknossos annotations to OME-Zarr or NIfTI-Zarr" (#16). At this point, I don't think that we need to automate this process, since it is not clear when annotations would be ready for upload to lincbrain.org. As long as we have a script, we can seamlessly export, convert, and upload the annotations.
- Clean up current test datasets and annotations
  - This issue is now tracked in "Clean up existing test Webknossos datasets and annotations" (#17).
- As part of our LINC infrastructure monthly expense reports, we will monitor costs and can potentially switch to an EC2 Reserved Instance to reduce costs.
- Test with optimized dataset that is chunked to 32x32
  - Discussion with the Webknossos team and their documentation suggest that 32 is an optimal chunk size (full range is 32-128). We can evaluate switching to 32^3 once we have Zarr sharding in place; otherwise, there will be too many files.
- Add guidelines for the naming convention of datasets with multiple layers and annotations
  - This issue is now tracked in "Determine BIDS-friendly naming scheme for Webknossos datasets with multiple imaging layers" (#19) and "Determine BIDS-friendly naming scheme for Webknossos annotations" (#20).
- Upload layers to existing datasets scalableminds/webknossos#7444
Not implemented
- Match the threading options to the instance type
- Only selected users can have write permissions on select annotations
  - This is not currently needed as multiple annotators will not be editing the same annotation
  - Users have to be manually added to annotations to be able to edit. Setting up this feature would require separate teams.
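
The retention idea from the "Configure backups" item above (rolling 7-day backups, then weekly backups) could be sketched as follows. This is only an illustration of the selection logic, not the deployed backup script: the function name `backups_to_keep`, the `daily_window` parameter, and the choice to keep the earliest backup of each ISO week are all assumptions. Moving objects between S3 storage classes is not shown.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily_window=7):
    """Return the backup dates to retain (hypothetical policy sketch).

    Every backup from the last `daily_window` days is kept. For older
    backups, only the earliest backup in each ISO week is kept.
    """
    cutoff = today - timedelta(days=daily_window)
    keep = set()
    weekly = {}  # (ISO year, ISO week) -> earliest backup date seen in that week
    for d in sorted(backup_dates):
        if d > cutoff:
            keep.add(d)  # still inside the rolling daily window
        else:
            week = tuple(d.isocalendar()[:2])
            weekly.setdefault(week, d)  # first (earliest) date wins
    keep.update(weekly.values())
    return sorted(keep)


if __name__ == "__main__":
    today = date(2024, 9, 30)
    daily = [today - timedelta(days=i) for i in range(21)]
    for kept in backups_to_keep(daily, today):
        print(kept.isoformat())
```

Anything not returned by the function would be a candidate for deletion, and the weekly survivors would be the ones to transition to colder storage.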


aaronkanzer · 2024-06-18T14:13:43Z

@kabilar merged #13 and merged into production (still need to update dev -> staging -> prod workflow)

We should still finalize handling of this use case: Configure front-end so that AWS Access Key ID and Secret Access Key are not required by user

kabilar · 2024-06-25T18:39:58Z

Hi @aaronkanzer, I updated our list above. Phases 3 and/or 4 may get pushed into year 2.

aaronkanzer · 2024-06-26T16:17:54Z

Hi @kabilar -- a couple updates/progress here for Phase 2:

Automated pushes of Docker images to DockerHub upon pull request merges to the master branch

Right now, we are configured to push every branch, whether or not it is master. I'd actually like to keep it this way, since we need to push containers remotely to test them properly in staging. Perhaps we can discuss producing a CHANGELOG for what ends up in production.

I'm also planning to follow the WebKNOSSOS migrations documentation, as it describes any updates we might need to make on our end when we sync our fork with upstream.

How will we update Docker container? Will there be downtime?

We will have minimal downtime -- this should be as simple as:

1. Pull down the newer Docker image of the API.
2. Stop the existing API container, killing it if necessary so that it is fully stopped.
3. Replace the API image tag referenced in docker-compose.yml.
4. Run `docker-compose up webknossos` to launch the new version.

As long as the newer version is compatible with pre-existing schemas (e.g. postgres, fossildb), this should work as intended. We will have staging as a safeguard for any issues of course.
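
As a rough sketch of the update steps above, the helper below rewrites the image tag in the compose file text and lists the commands in the order they would run. The image name `example/webknossos`, the function names, and the use of `docker-compose rm -f` as the "fully stop" step are assumptions for illustration; the actual compose file and image reference may differ.

```python
import re


def set_image_tag(compose_text, image, new_tag):
    """Replace the tag of `image` in docker-compose YAML text (step 3)."""
    pattern = re.compile(r"(image:\s*" + re.escape(image) + r"):\S+")
    updated, count = pattern.subn(lambda m: m.group(1) + ":" + new_tag, compose_text)
    if count == 0:
        raise ValueError(f"image {image!r} not found in compose file")
    return updated


def update_commands(image, new_tag, service="webknossos"):
    """Return the commands for the update as argv lists, in order.

    The compose file edit (set_image_tag) happens after the stop/rm
    commands and before the final `up`.
    """
    return [
        ["docker", "pull", f"{image}:{new_tag}"],  # step 1
        ["docker-compose", "stop", service],       # step 2
        ["docker-compose", "rm", "-f", service],   # step 2, assumed cleanup
        ["docker-compose", "up", service],         # step 4
    ]


if __name__ == "__main__":
    compose = "services:\n  webknossos:\n    image: example/webknossos:1.0\n"
    print(set_image_tag(compose, "example/webknossos", "2.0"))
    for argv in update_commands("example/webknossos", "2.0"):
        print(" ".join(argv))
```

Printing the commands rather than executing them keeps this a dry run; each argv list could be passed to `subprocess.run(argv, check=True)` once the sequence has been validated in staging.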

Configure backups

Currently working through this. I've been using the scalableminds fossildb-client for the backup and restore commands (see here for code reference), and I'm still stress-testing to confirm that the workflow restores properly.

We will know more about backup sizes later, but it appears to be only a couple of KB per annotation in FossilDB.

For Postgres, this is the standard pg_dump scenario (similar to LINC Archive).

For both, my hope is to have a cron job on the EC2 instance that exports the backups to S3 routinely. We can discuss the cadence for the cron job.
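
A minimal sketch of what such a cron-driven Postgres export could look like is below. It only builds the commands and the S3 key; the bucket name, key layout, local path, and function names are hypothetical, and the FossilDB side is left out because its client commands are not described here.

```python
from datetime import datetime, timezone


def backup_key(component, when, prefix="backups"):
    """Build a timestamped S3 object key (hypothetical layout)."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    return f"{prefix}/{component}/{stamp}.dump"


def postgres_backup_commands(database, bucket, when, local_path="/tmp/postgres.dump"):
    """Return the dump and upload commands as argv lists, in order."""
    key = backup_key("postgres", when)
    return [
        # Dump the database in pg_dump's custom archive format.
        ["pg_dump", "--format=custom", f"--file={local_path}", database],
        # Copy the dump to S3 with the AWS CLI.
        ["aws", "s3", "cp", local_path, f"s3://{bucket}/{key}"],
    ]


if __name__ == "__main__":
    # Example crontab entry (assumed script path), running daily at 03:00:
    #   0 3 * * * /usr/bin/python3 /opt/backup/postgres_backup.py
    now = datetime.now(timezone.utc)
    for argv in postgres_backup_commands("webknossos", "example-backup-bucket", now):
        print(" ".join(argv))
```

Timestamped keys keep each run's export separate, so the cadence of the cron job can change without overwriting earlier backups.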

I'm going to continue to update #10 with more playbook-ish docs for data restoration/backup -- seems we are in a good place overall.

Let me know if you have any questions in the meantime.

aaronkanzer · 2024-06-26T20:06:07Z

@kabilar A couple other fun updates (albeit in Phase 4, but still good in terms of understanding how WebKNOSSOS<>LINC will eventually communicate)

In this commit, I introduced SameSite of .lincbrain.org for our WebKNOSSOS auth-related cookies 🍪

(Assuming you are logged into staging), you can now see GET payloads for datasets and annotations, for example. We can ping these endpoints to reflect whatever is needed in the LINC Data Platform.

A couple of other useful notes here:
• WebKNOSSOS datasets are name-unique, thus if "dataset-zarr-hipct" exists, a user can't come along and create "dataset-zarr-hipct" for a different dataset -- this could get tricky for drafts, etc. but something to think about in terms of naming conventions

• We will still need to determine the workflow for how an end user gets the cookie initially -- right now, when you log into WebKNOSSOS, the expiration of the cookie is 1 year -- we can alter as needed. There is a POST route for /auth/login for WebKNOSSOS, but that logic could be more complex -- can look into further.

Nevertheless, good news here in terms of how we want to guide LINC users towards annotations and datasets in WebKNOSSOS
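
For reference, here is a small sketch of the kind of cookie header discussed above. Note that SameSite itself only takes the values Strict, Lax, or None; sharing a cookie across subdomains is normally done with the Domain attribute, so this sketch assumes that `.lincbrain.org` is set as the cookie's Domain. The cookie name, the `Lax` value, and the function name are hypothetical and are not taken from the commit.

```python
from http.cookies import SimpleCookie

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # matches the 1-year expiration noted above


def build_auth_cookie(name, value, domain=".lincbrain.org", max_age=ONE_YEAR_SECONDS):
    """Return a Set-Cookie header value shared across subdomains of `domain`."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["domain"] = domain      # visible to lincbrain.org and its subdomains
    morsel["path"] = "/"
    morsel["max-age"] = max_age
    morsel["secure"] = True        # only sent over HTTPS
    morsel["httponly"] = True      # not readable from page JavaScript
    morsel["samesite"] = "Lax"     # assumed value
    return morsel.OutputString()


if __name__ == "__main__":
    print(build_auth_cookie("example_auth", "token123"))
```

Lowering `max_age` is the knob to change if the 1-year lifetime turns out to be too long.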

kabilar · 2024-06-27T14:51:56Z

Awesome, thanks for breaking down each component. This is very helpful. I have some naive questions that we can discuss today.

aaronkanzer · 2024-06-28T13:20:07Z

Update lincbrain.org front-end to include a copy button for S3 URI of an asset

Handled via code mostly in lincbrain/linc-archive#175

aaronkanzer · 2024-09-13T18:14:15Z

@kabilar proper backup scripts and cronjob definitions should be all set 📈 68768f1

kabilar · 2024-10-18T15:15:35Z

Hi @aaronkanzer, I have changed all users (except for you and me) from Admin back to Dataset Manager.

kabilar · 2024-10-18T16:04:50Z

Hi @aaronkanzer, if I recall correctly, there was an issue with the FossilDB versions when pulling from upstream. This issue was noticeable when we rebooted Webknossos. Were you able to resolve this issue? Thanks.

aaronkanzer · 2024-10-18T17:11:18Z

> Hi @aaronkanzer, if I recall correctly, there was an issue with the FossilDB versions when pulling from upstream. This issue was noticeable when we rebooted Webknossos. Were you able to resolve this issue? Thanks.

@kabilar I haven't noticed any issues since downgrading from FossilDB v489 to v484, so I'm going to consider things stable as such. Do you recall seeing this issue recently?

kabilar · 2024-10-18T17:51:32Z

> @kabilar I haven't noticed any issues since downgrading from FossilDB v489 to v484, so i'm going to consider things stable as such -- did you recall seeing this issue recently?

Thanks Aaron. Sounds great. I haven't seen this issue recently.

kabilar · 2024-10-18T19:20:51Z

Thanks @aaronkanzer for the huge effort on the Webknossos deployment. We can consider the above features complete.

Hi @lincbrain/comp-team, we can now consider our deployment of Webknossos to be in production. A few notes below:

- We deployed Webknossos based on the requirements from our design doc: https://github.com/lincbrain/linc-archive/pull/159/files
- See the first comment from this issue for all of the features that were built, deployed, and tested.
- The LINC Docs have been updated with an Add Webknossos Dataset page.
- Future work for Webknossos is currently tracked under other issues in this repository. Feel free to file an issue for any bug fixes or feature requests.

Thank you.

kabilar added the enhancement New feature or request label Jun 11, 2024

kabilar assigned kabilar and aaronkanzer Jul 3, 2024

kabilar removed the enhancement New feature or request label Jul 3, 2024

kabilar closed this as completed Oct 18, 2024
