This documentation is specific to running the forecasting models for the US and Washington state to update this page on the SFA public website.
Running the commands below requires appropriate permissions on S3, AWS Batch, and CloudWatch, and setting the env vars `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN`; or `AWS_PROFILE`.
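For example, credentials can be exported directly or via a named profile; all of the values and the profile name below are placeholders, not the real credentials:

```shell
# Option 1: static credentials (placeholder values)
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_SESSION_TOKEN="<your-session-token>"

# Option 2: a named profile from ~/.aws/config (hypothetical profile name)
export AWS_PROFILE="nextstrain"
```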
To run locally:

```
nextstrain build . --configfile config/config.yaml --config data_provenances=gisaid geo_resolutions=usa
```
To run on AWS, using Nextstrain's AWS Batch runtime:

```
nextstrain build --aws-batch --aws-batch-s3-bucket <bucket-name> --aws-batch-job <aws-batch-job-name> --aws-batch-queue <aws-batch-job-queue-name> . --configfile config/config.yaml --config data_provenances=gisaid geo_resolutions=usa
```
The results folder is automatically downloaded when the build completes. Alternatively, we can run the build in the background with `--detach` and retrieve the results from S3 later.
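A detached run might look like the following sketch, using the same placeholder bucket name as above and a job ID printed by the CLI:

```shell
# Start the build in the background; the CLI prints the AWS Batch job ID and exits
nextstrain build --aws-batch --detach \
  --aws-batch-s3-bucket <bucket-name> \
  . --configfile config/config.yaml \
  --config data_provenances=gisaid geo_resolutions=usa

# Later: re-attach to the job to stream logs and download the results folder
nextstrain build --aws-batch --attach <job-id> .
```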
In production, ECS is used to run the command above as a scheduled task, in detached mode.
A minimal Docker image including the Nextstrain CLI and the contents of this repo is published to ECR and used to run the `nextstrain build` commands from ECS. See README.md.
An ECS task definition specifies the environment and the Nextstrain CLI command to run, and an ECS cluster runs this as a scheduled task.
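As a sketch of the shape of such a task definition (the family, container name, and image URI below are hypothetical placeholders, not the production values):

```json
{
  "family": "sfa-forecasts",
  "containerDefinitions": [
    {
      "name": "forecasts",
      "image": "<account-id>.dkr.ecr.<region>.amazonaws.com/<repo>:latest",
      "command": [
        "nextstrain", "build", "--aws-batch", "--detach", ".",
        "--configfile", "config/config.yaml",
        "--config", "data_provenances=gisaid", "geo_resolutions=usa"
      ]
    }
  ]
}
```

A scheduled rule on the ECS cluster then runs this task definition on a cron-like schedule.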
To update the files available from the public website, an additional workflow rule can be run by appending `config/sfa-optional.yaml` to the existing `--configfile` argument. This config file specifies the destination bucket, triggers the additional workflow that compresses and copies the files from the source bucket to the destination, and sets the appropriate encoding and content types (`content-encoding: gzip` and `content-type: application/json`) so the gzipped files are served correctly via CloudFront.
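Concretely, a run that also updates the website files might look like this sketch, with the same placeholder bucket and queue names as above:

```shell
nextstrain build --aws-batch \
  --aws-batch-s3-bucket <bucket-name> \
  --aws-batch-job <aws-batch-job-name> \
  --aws-batch-queue <aws-batch-job-queue-name> \
  . --configfile config/config.yaml config/sfa-optional.yaml \
  --config data_provenances=gisaid geo_resolutions=usa
```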