This tutorial will teach you how to run OPAL using the official docker images.
We also have another tutorial for running OPAL with an example docker-compose configuration. This other tutorial is better for learning about OPAL in a live playground environment.
Use this tutorial if you |
|
Use the other tutorial if you |
|
Our recommendation is to start with the docker-compose playground (quicker setup, better as a first tutorial) and then come back here to learn how to set up OPAL with a real configuration.
- Download OPAL images from Docker Hub
- Before you begin
- How to run OPAL Server
- How to run OPAL Client
- How to push data updates from an authoritative source
Image Name | How to Download | Description |
---|---|---|
OPAL Server | docker pull authorizon/opal-server | |
OPAL Client | docker pull authorizon/opal-client | |
OPAL Client (Standalone) | docker pull authorizon/opal-client-standalone | |
Since running OPAL is simply a matter of spinning up docker containers, OPAL is cloud-ready and can fit in many environments: AWS (ECS, EKS, etc), Google Cloud, Azure, Kubernetes, etc.
Each environment has different instructions on how to run container-based applications, and as such, environment-specific instructions are outside the scope of this tutorial. We will show you how to run the container locally with `docker run`, and you can then apply the necessary changes to your runtime environment.
We at authorizon currently run our OPAL production cluster using the following services:
- AWS ECS Fargate - for container runtime.
- AWS Secrets Manager - to store sensitive OPAL config vars.
- AWS Certificate Manager - for HTTPS certificates.
- AWS ELB - for load balancer.
Example docker run command (no worries, we will show real commands later):
docker run -it \
-v ~/.ssh:/root/ssh \
-e "OPAL_AUTH_PRIVATE_KEY=${OPAL_AUTH_PRIVATE_KEY}" \
-e "OPAL_AUTH_PUBLIC_KEY=${OPAL_AUTH_PUBLIC_KEY}" \
-e "OPAL_POLICY_REPO_URL=${OPAL_POLICY_REPO_URL}" \
-p 7002:7002 \
authorizon/opal-server
This command | In production environments |
---|---|
Runs the docker container in interactive mode | Typically no such option |
Mounts the ~/.ssh dir as a volume | Varies between environments, e.g. in AWS ECS you would mount volumes via the task definition. |
Passes the following env vars to the docker container as config: OPAL_AUTH_PRIVATE_KEY, OPAL_AUTH_PUBLIC_KEY, OPAL_POLICY_REPO_URL. | Varies between environments, e.g. in AWS ECS you would specify env var values via the task definition. |
Exposes port 7002 on the host machine. | Varies between environments, e.g. in AWS ECS you would specify exposed ports in the task definition, and you will have to expose these ports via a load balancer. |
We will now explain how to pass configuration variables to OPAL.
- In its dockerized form, OPAL server and client containers pick up their configuration variables from environment variables prefixed with `OPAL_` (e.g: `OPAL_DATA_CONFIG_SOURCES`, `OPAL_POLICY_REPO_URL`, etc).
- The OPAL CLI can pick up config vars from either environment variables prefixed with `OPAL_` or from CLI arguments (interchangeable).
  - Supported CLI options are listed in `--help`.
  - Each CLI argument maps to a corresponding environment variable:
    - Simply convert the CLI argument name to SCREAMING_SNAKE_CASE, and prefix it with `OPAL_`.
    - Examples:
      - `--server-url` becomes `OPAL_SERVER_URL`
      - `--data-config-sources` becomes `OPAL_DATA_CONFIG_SOURCES`
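The naming rule above can be sketched in a few lines of Python. This is only an illustration of the conversion described in the list; the helper name `cli_option_to_env_var` is hypothetical and is not part of the OPAL CLI.

```python
def cli_option_to_env_var(option: str, prefix: str = "OPAL_") -> str:
    """Map a CLI option such as --server-url to its env var name."""
    # Drop the leading dashes, then turn kebab-case into SCREAMING_SNAKE_CASE.
    name = option.lstrip("-").replace("-", "_").upper()
    return prefix + name


print(cli_option_to_env_var("--server-url"))           # OPAL_SERVER_URL
print(cli_option_to_env_var("--data-config-sources"))  # OPAL_DATA_CONFIG_SOURCES
```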
You should read and understand OPAL Security Model before going to production.
However, we list the mandatory checklist briefly here as well:
- OPAL server should always be protected with a TLS/SSL certificate (i.e: HTTPS).
- OPAL server should always run in secure mode - meaning JWT token verification should be active.
- OPAL server should be configured with a master token.
- Sensitive configuration variables (i.e: environment variables with sensitive values) should always be stored in a dedicated Secret Store.
  - Example secret stores: AWS Secrets Manager, HashiCorp Vault, etc.
- NEVER EVER EVER store secrets as part of your source code (e.g: in your git repository).
This section explains how to run OPAL Server.
If you run the docker image locally, you need docker installed on your machine.
Run this command to get the image:
docker pull authorizon/opal-server
If you run in a cloud environment (e.g: AWS ECS), specify `authorizon/opal-server` in your task definition or equivalent.
Running the OPAL server container is simply a matter of calling docker run, but we need to pass the OPAL server container the necessary configuration via environment variables. The following sections explain each class of configuration variables and how to set their values, after which we will demonstrate real examples.
When scaling the OPAL Server to multiple workers and/or multiple containers, we use a broadcast channel to sync between all the instances of OPAL Server. In other words, communication on the broadcast channel is communication between OPAL servers, and is not related to the OPAL client.
Under the hood, our interface to the broadcast channel backbone service is implemented by encode/broadcaster.
At the moment, the supported broadcast channel backbones are:
- Postgres LISTEN/NOTIFY
- Redis
- Kafka
Deploying the actual service used for broadcast (e.g: Redis) is outside the scope of this tutorial. The easiest way is to use a managed service (e.g: AWS RDS, AWS ElastiCache, etc), but you can also deploy your own containers.
When running in production, you should run with multiple workers per server instance (i.e: container/node), if not multiple containers, and thus deploying the backbone service becomes mandatory for production environments.
Declaring the broadcast uri is optional, depending on whether you deployed a broadcast backbone service and are also running with more than one OPAL server instance (multiple workers or multiple nodes). If you are running with multiple server instances (you should for production), declaring the broadcast uri is mandatory.
Env Var Name | Function |
---|---|
OPAL_BROADCAST_URI | |
As we mentioned in the previous section, each container can run multiple workers, and if you use more than one, you need a broadcast channel.
This is how you define the number of workers (pay attention: this env var is not prefixed with `OPAL_`):
Env Var Name | Function |
---|---|
UVICORN_NUM_WORKERS | the number of workers in a single container (example value: 4 ) |
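The relationship between the two variables can be checked before starting the container. The following is a minimal sketch, not part of OPAL: the helper name `check_broadcast_config` is hypothetical, and treating an unset `UVICORN_NUM_WORKERS` as a single worker is an assumption made for illustration. It only covers multiple workers inside one container; it cannot detect that you run several containers.

```python
from typing import Mapping


def check_broadcast_config(env: Mapping[str, str]) -> None:
    """Raise ValueError if several workers are requested without a broadcast URI."""
    # Assumption: an unset UVICORN_NUM_WORKERS is treated as one worker.
    workers = int(env.get("UVICORN_NUM_WORKERS", "1"))
    broadcast_uri = env.get("OPAL_BROADCAST_URI", "")
    if workers > 1 and not broadcast_uri:
        raise ValueError(
            f"UVICORN_NUM_WORKERS={workers} requires OPAL_BROADCAST_URI to be set"
        )


# Passes silently: four workers with a broadcast backbone declared.
check_broadcast_config(
    {"UVICORN_NUM_WORKERS": "4", "OPAL_BROADCAST_URI": "postgres://localhost/mydb"}
)
```

To check the real environment, pass `os.environ` instead of the literal dictionary.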
OPAL server is responsible for tracking policy changes and pushing them to OPAL clients.
At the moment, OPAL can track a git repository as the policy source.
Env Var Name | Function |
---|---|
OPAL_POLICY_REPO_URL | |
If your tracked policy repo is private, you should declare this env var in order to authenticate and successfully clone the repo:
Env Var Name | Function |
---|---|
OPAL_POLICY_REPO_SSH_KEY | |
For these config vars, in most cases you are good with the default values:
Env Var Name | Function |
---|---|
OPAL_POLICY_REPO_CLONE_PATH | Where (i.e: target path) to clone the repo in your docker filesystem (not important unless you mount a docker volume) |
OPAL_POLICY_REPO_MAIN_BRANCH | Name of the git branch to track for policy files (default: `master`) |
OPAL_POLICY_REPO_MAIN_REMOTE | Name of the git remote to fetch new commits from (default: `origin`) |
Currently OPAL server supports two ways to detect changes in the policy git repo:
- Polling in fixed intervals - checks every X seconds if new commits are available.
- GitHub Webhooks - if the git repo is stored on GitHub, you may set up a webhook (we plan to expand to generic webhooks in the near future).
You may use polling by setting the following env var to a value other than `0`:
Env Var Name | Function |
---|---|
OPAL_POLICY_REPO_POLLING_INTERVAL | the interval in seconds to use for polling the policy repo |
We strongly recommend using webhooks if your policy repo is stored in a supported service (currently GitHub; we are working on expanding this). Webhooks are much more efficient with network traffic, and won't contaminate your logs.
If your server is hosted at `https://opal.yourdomain.com`, the webhook URL you must set up with your webhook provider (e.g: GitHub) is `https://opal.yourdomain.com/webhook`. See GitHub's guide on configuring webhooks.
Typically you would need to share a secret with your webhook provider (authenticating incoming webhooks). You can use the OPAL CLI to create a cryptographically strong secret to use.
Let's install the cli to a new python virtualenv:
pyenv virtualenv opal
pyenv activate opal
pip install opal-server
Now let's use the cli to generate a secret:
opal-server generate-secret
You must then configure the appropriate env var:
Env Var Name | Function |
---|---|
OPAL_POLICY_REPO_WEBHOOK_SECRET | the webhook secret generated by the cli (or any other secret you pick) |
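To illustrate what the shared secret is for, here is a minimal sketch of how a GitHub-style webhook signature can be produced and checked. GitHub signs the raw request body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The helper names are hypothetical, `secrets.token_urlsafe` is only a stand-in for `opal-server generate-secret`, and this is not a description of how OPAL server itself verifies webhooks.

```python
import hashlib
import hmac
import secrets


def new_webhook_secret() -> str:
    """Stand-in for the CLI: a URL-safe random secret built from 32 random bytes."""
    return secrets.token_urlsafe(32)


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the value GitHub places in the X-Hub-Signature-256 header."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Constant-time comparison of the expected and the received signature."""
    return hmac.compare_digest(sign_payload(secret, body), header_value)


secret = new_webhook_secret()
body = b'{"ref": "refs/heads/master"}'
print(is_valid_signature(secret, body, sign_payload(secret, body)))
```

The comparison uses `hmac.compare_digest` rather than `==` so that the check does not leak timing information about the expected signature.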
The OPAL server serves the base data source configuration for OPAL client. The configuration is structured as directives for the client, each directive specifies what to fetch (url), and where to put it in OPA data document hierarchy (destination path).
The data sources configured on the server will be fetched by the client every time it decides it needs to fetch the entire data configuration (e.g: when the client first loads, after a period of disconnection from the server, etc). This configuration must always point to a complete and up-to-date representation of the data (not a "delta").
You'll need to configure this env var:
Env Var Name | Function |
---|---|
OPAL_DATA_CONFIG_SOURCES | Directives on how to fetch the data configuration we load into OPA cache when OPAL client starts, and where to put it. |
The value of the data sources config variable is a json encoding of the ServerDataSourceConfig pydantic model.
{
"config": {
"entries": [
{
"url": "https://api.authorizon.com/v1/policy-config",
"topics": [
"policy_data"
],
"config": {
"headers": {
"Authorization": "Bearer FAKE-SECRET"
}
}
}
]
}
}
Let's break down this example value (check the schema for more options):
Each object in `entries` (schema: DataSourceEntry) is a directive that tells OPAL client to fetch the data and place it in OPA cache using the Data API.
- From where to fetch: we tell OPAL client to fetch data from the authorizon API (specifically, from the `policy-config` endpoint).
- How to fetch (optional): we can direct the client to use a specific configuration when fetching the data, for example here we tell the client to use a specific HTTP Authorization header with a bearer token in order to authenticate to the API.
- Where to place the data in OPA cache: although not specified, this entry uses the default of `/`, which means at the root of OPA document hierarchy. You can specify another path with `dst_path` (check the schema).
You can use Python's `json.dumps()` to get a one-line string:
❯ ipython
In [1]: x = {
...: "config": {
...: "entries": [
...: ... # removed for brevity
...: ]
...: }
...: }
In [2]: import json
In [3]: json.dumps(x)
Out[3]: '{"config": {"entries": [{"url": "https://api.authorizon.com/v1/policy-config", "topics": ["policy_data"], "config": {"headers": {"Authorization": "Bearer FAKE-SECRET"}}}]}}'
Placing this value in an env var:
export OPAL_DATA_CONFIG_SOURCES='{"config": {"entries": [{"url": "https://api.authorizon.com/v1/policy-config", "topics": ["policy_data"], "config": {"headers": {"Authorization": "Bearer FAKE-SECRET"}}}]}}'
Be aware that this does not work well in docker-compose: Docker Compose does not know how to deal with env vars that contain spaces, and it treats single quotes (i.e: `''`) as part of the value. With `docker run` you should be fine.
Since `OPAL_DATA_CONFIG_SOURCES` often contains secrets, in production you should place it in a secrets store.
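If you prefer a script over an interactive session, the same one-line value can be produced like this. It is a minimal sketch: the helper name `build_data_sources_config` is hypothetical, and the URL and token are the placeholder values from the example above.

```python
import json
import os


def build_data_sources_config(url: str, token: str, topics=("policy_data",)) -> str:
    """Return the one-line JSON string expected in OPAL_DATA_CONFIG_SOURCES."""
    entry = {
        "url": url,
        "topics": list(topics),
        "config": {"headers": {"Authorization": f"Bearer {token}"}},
    }
    return json.dumps({"config": {"entries": [entry]}})


value = build_data_sources_config(
    "https://api.authorizon.com/v1/policy-config", "FAKE-SECRET"
)
print(value)

# Environment for a child process (e.g. one started with subprocess.run).
child_env = dict(os.environ, OPAL_DATA_CONFIG_SOURCES=value)
```

Passing the value through a process environment in this way avoids shell quoting altogether.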
In this step we show how to configure the OPAL server security parameters.
Declaring these parameters and passing them to OPAL server will cause the server to run in secure mode, which means client identity verification will be active. All the values in this section are sensitive; in production you should place them in a secrets store.
In a dev environment, secure mode is optional and you can skip this section.
However, in production environments you should run in secure mode.
Using a utility like ssh-keygen we can easily generate the keys (on Windows try SSH-keys Windows guide).
ssh-keygen -t rsa -b 4096 -m pem
Follow the instructions to save the keys to two files.
Env Var Name | Function |
---|---|
OPAL_AUTH_PRIVATE_KEY | |
OPAL_AUTH_PUBLIC_KEY | |
Example values:
If your private key looks like this (we redacted most of the key)
-----BEGIN OPENSSH PRIVATE KEY-----
XXX...
...
...XXX==
-----END OPENSSH PRIVATE KEY-----
Declare it like this (notice how we simply replace new lines with underscores):
export OPAL_AUTH_PRIVATE_KEY="-----BEGIN OPENSSH PRIVATE KEY-----_XXX..._..._...XXX==_-----END OPENSSH PRIVATE KEY-----"
For public keys, it should be something like this:
export OPAL_AUTH_PUBLIC_KEY="ssh-rsa XXX ... XXX== [email protected]"
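Replacing the newlines by hand is error-prone, so a small helper can do it. This is a minimal sketch of the newline-to-underscore rule described above; the function name `pem_to_env_value` is hypothetical, and the sample key is the redacted placeholder from this section, not a real key.

```python
def pem_to_env_value(pem_text: str) -> str:
    """Join the lines of a PEM-style key with underscores instead of newlines."""
    # strip() drops the trailing newline so the value does not end with "_".
    return "_".join(pem_text.strip().splitlines())


sample = (
    "-----BEGIN OPENSSH PRIVATE KEY-----\n"
    "XXX...\n"
    "...\n"
    "...XXX==\n"
    "-----END OPENSSH PRIVATE KEY-----\n"
)
print(pem_to_env_value(sample))
```

For a real key, read the file contents first (for example with `pathlib.Path(path).read_text()`) and pass the text to the helper.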
You can choose any secret you'd like, but as we showed earlier, the OPAL CLI can be used to generate cryptographically strong secrets easily:
opal-server generate-secret
You must then configure the master token like so:
Env Var Name | Function |
---|---|
OPAL_AUTH_MASTER_TOKEN | the master token generated by the cli (or any other secret you pick) |
To summarize, the previous steps guided you on how to pick the values of the configuration variables needed to run OPAL server.
We will now recap with a real example.
docker pull authorizon/opal-server
Multiple workers and broadcast channel (example values from step 2):
export OPAL_BROADCAST_URI=postgres://localhost/mydb
export UVICORN_NUM_WORKERS=4
Policy repo (example values from step 3):
export OPAL_POLICY_REPO_URL=https://github.com/authorizon/opal-example-policy-repo
Policy repo syncing with webhook (example values from step 4):
export OPAL_POLICY_REPO_WEBHOOK_SECRET=-cBlFnldg7WCGlj0jsivPWPA5vtfI2GWmp1wVx657Vk
Data sources configuration (example values from step 5):
export OPAL_DATA_CONFIG_SOURCES='{"config": {"entries": [{"url": "https://api.authorizon.com/v1/policy-config", "topics": ["policy_data"], "config": {"headers": {"Authorization": "Bearer FAKE-SECRET"}}}]}}'
Security parameters (example values from step 6):
export OPAL_AUTH_PRIVATE_KEY="-----BEGIN OPENSSH PRIVATE KEY-----_XXX..._..._...XXX==_-----END OPENSSH PRIVATE KEY-----"
export OPAL_AUTH_PUBLIC_KEY="ssh-rsa XXX ... XXX== [email protected]"
export OPAL_AUTH_MASTER_TOKEN=8MHfUU2rzRB59pdOHNNVVw3XLe3gl9YNw7vIXxJZNJo
docker run -it \
--env OPAL_BROADCAST_URI \
--env UVICORN_NUM_WORKERS \
--env OPAL_POLICY_REPO_URL \
--env OPAL_POLICY_REPO_WEBHOOK_SECRET \
--env OPAL_DATA_CONFIG_SOURCES \
--env OPAL_AUTH_PRIVATE_KEY \
--env OPAL_AUTH_PUBLIC_KEY \
--env OPAL_AUTH_MASTER_TOKEN \
-p 7002:7002 \
authorizon/opal-server
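When `--env NAME` is given without a value, docker copies the value from the calling environment, so a variable you forgot to export is easy to miss. The sketch below checks for that and assembles the same command as an argument list. The helper names are hypothetical, and the list of variables simply mirrors the example above.

```python
import shlex
from typing import Iterable, List, Mapping

# The variables passed to the server container in the example command.
SERVER_ENV_VARS = [
    "OPAL_BROADCAST_URI",
    "UVICORN_NUM_WORKERS",
    "OPAL_POLICY_REPO_URL",
    "OPAL_POLICY_REPO_WEBHOOK_SECRET",
    "OPAL_DATA_CONFIG_SOURCES",
    "OPAL_AUTH_PRIVATE_KEY",
    "OPAL_AUTH_PUBLIC_KEY",
    "OPAL_AUTH_MASTER_TOKEN",
]


def missing_env_vars(names: Iterable[str], env: Mapping[str, str]) -> List[str]:
    """Return the names that are unset or empty in the given environment."""
    return [name for name in names if not env.get(name)]


def docker_run_args(image: str, env_names: Iterable[str], port: int) -> List[str]:
    """Build the docker run argument list, forwarding each env var by name."""
    args = ["docker", "run", "-it"]
    for name in env_names:
        args += ["--env", name]
    args += ["-p", f"{port}:{port}", image]
    return args


print(shlex.join(docker_run_args("authorizon/opal-server", SERVER_ENV_VARS, 7002)))
```

A typical use is to call `missing_env_vars(SERVER_ENV_VARS, os.environ)` first and only hand the argument list to `subprocess.run` when the result is empty.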
As we mentioned before, in production you will not use `docker run`.
Deployment looks somewhat like this:
- Declare your container configuration in code, e.g: AWS ECS task definition file, Helm chart, etc.
- All the secrets and sensitive vars should be fetched from a secrets store.
- Deploy your task / helm chart, etc to your cloud environment.
- Expose the server to the internet with HTTPS (i.e: use a valid SSL/TLS certificate).
- Keep your master token in a safe location (you will need it shortly to generate identity tokens).
Great! We have OPAL Server up and running. Let's continue and explain how to run OPAL Client.
Run this command to get the image that comes with built-in OPA (recommended if you don't already have OPA installed in your environment):
docker pull authorizon/opal-client
If you run in a cloud environment (e.g: AWS ECS), specify `authorizon/opal-client` in your task definition or equivalent.
Otherwise, if you are already running OPA in your environment, run this command to get the standalone client image instead:
docker pull authorizon/opal-client-standalone
In production environments, OPAL server should be running in secure mode, and the OPAL client must have a valid identity token (which is a signed JWT) in order to successfully connect to the server.
Obtaining a token is easy. You'll need the OPAL server's master token in order to request a JWT token.
Let's install the `opal-client` cli to a new python virtualenv (assuming you didn't already create one):
# this command is not necessary if you already created this virtualenv
pyenv virtualenv opal
# this command is not necessary if the virtualenv is already active
pyenv activate opal
# this command installs the client cli
pip install opal-client
You can obtain a client token with this cli command:
opal-client obtain-token MY_MASTER_TOKEN --uri=https://opal.yourdomain.com --type client
This example assumes that:
- You deployed OPAL server to `https://opal.yourdomain.com`.
- The master token of your deployment is `MY_MASTER_TOKEN`.
  - However, if you followed our tutorial for the server, you probably generated one while configuring the security parameters, and that is the master token you should use.
Example output:
{
"token": "eyJ0...8wsk",
"type": "bearer",
"details": { ... }
}
Put the generated token value (the one inside the `token` key) into this environment variable:
Env Var Name | Function |
---|---|
OPAL_CLIENT_TOKEN | The client identity token (JWT) used for identification against OPAL server. |
Example:
export OPAL_CLIENT_TOKEN=eyJ0...8wsk
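If you script this step, the token can be pulled out of the CLI output with the standard `json` module. This is a minimal sketch that assumes the command prints a JSON object shaped like the example output above; the helper name `extract_token` is hypothetical, and the sample string reuses the shortened placeholder token.

```python
import json


def extract_token(cli_output: str) -> str:
    """Return the value of the "token" key from the obtain-token JSON output."""
    data = json.loads(cli_output)
    return data["token"]


# Sample shaped like the example output (details left empty for the sketch).
sample_output = '{"token": "eyJ0...8wsk", "type": "bearer", "details": {}}'
print(extract_token(sample_output))
```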
Set the following environment variable according to the address of the deployed OPAL server:
Env Var Name | Function |
---|---|
OPAL_SERVER_URL | The internet address (uri) of the deployed OPAL server. In production, you must use an https:// address for security. |
For example, if the OPAL server is available at `https://opal.yourdomain.com`:
export OPAL_SERVER_URL=https://opal.yourdomain.com
You can configure which topics for data updates the client will subscribe to. This is great if you want more granularity in your data model, for example:
- Enabling multi-tenancy: you deploy each customer (tenant) with their own OPA agent, and each agent's OPAL client subscribes only to the relevant tenant's topic.
- Sharding large datasets: you split a big data set (e.g: policies based on user attributes when you have many users) across many instances of the OPA agent, and each agent's OPAL client subscribes only to the relevant shard's topic.
If you do not specify data topics in your configuration, OPAL client will automatically subscribe to a single topic: `policy_data` (the default).
Use this env var to control which topics the client will subscribe to:
Env Var Name | Function |
---|---|
OPAL_DATA_TOPICS | Data topics delimited by commas (i.e: `,`) |
Example value:
export OPAL_DATA_TOPICS=topic1,topic2,topic3
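For multi-tenant or sharded deployments it can help to generate the value per client. The following is a minimal sketch with hypothetical helper names; the `policy_data/<tenant>` naming follows the `policy_data/tenant1` example used in this tutorial and is a convention, not a requirement.

```python
from typing import Iterable, List


def tenant_topic(tenant: str, base: str = "policy_data") -> str:
    """Topic name for one tenant, e.g. policy_data/tenant1."""
    return f"{base}/{tenant}"


def join_topics(topics: Iterable[str]) -> str:
    """Encode topics as the comma-delimited value of OPAL_DATA_TOPICS."""
    return ",".join(topics)


def parse_topics(value: str) -> List[str]:
    """Decode a comma-delimited value, ignoring blanks and stray spaces."""
    return [topic.strip() for topic in value.split(",") if topic.strip()]


print(join_topics([tenant_topic("tenant1"), tenant_topic("tenant2")]))
```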
If you are running with inline OPA (meaning OPAL client runs OPA for you in the same docker image), you can change the default parameters used to run OPA.
In order to override default configuration, you'll need to set this env var:
Env Var Name | Function |
---|---|
OPAL_INLINE_OPA_CONFIG | The value of this var should be an OpaServerOptions pydantic model encoded into json string. The process is similar to the one we showed on how to encode the value of OPAL_DATA_CONFIG_SOURCES. |
If OPA is deployed separately from OPAL (i.e: using the standalone image), you should define the URI of the OPA instance you want to manage with OPAL client with this env var:
Env Var Name | Function |
---|---|
OPAL_POLICY_STORE_URL | The internet address (uri) of the deployed standalone OPA. |
For example, if the standalone OPA is available at `https://opa.billing.yourdomain.com:8181`:
export OPAL_POLICY_STORE_URL=https://opa.billing.yourdomain.com:8181
Let's recap the previous steps with example values:
First, download opal client docker image:
docker pull authorizon/opal-client
Then, declare configuration with environment variables:
# let's say this is the (shortened) token we obtained from opal server
export OPAL_CLIENT_TOKEN=eyJ0...8wsk
# and this is where we deployed opal server
export OPAL_SERVER_URL=https://opal.yourdomain.com
# and let's say we subscribe to a specific tenant's data updates (i.e: `tenant1`)
export OPAL_DATA_TOPICS=policy_data/tenant1
Let's also assume we run OPA inline with the default options:
docker run -it \
--env OPAL_CLIENT_TOKEN \
--env OPAL_SERVER_URL \
--env OPAL_DATA_TOPICS \
-p 7000:7000 \
-p 8181:8181 \
authorizon/opal-client
Please notice that the OPAL client exposes two ports when running OPA inline:
- OPAL Client (port `:7000`) - the OPAL client API (i.e: healthcheck, etc).
- OPA (port `:8181`) - the port of the OPAL-managed OPA agent (OPA is running in server mode).
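A quick way to confirm that both ports are reachable after starting the container is a plain TCP connection test. This is a minimal sketch using only the standard library; the helper name `port_is_open` is hypothetical, `localhost` assumes the container runs on your machine, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports published by the docker run command above.
for name, port in (("OPAL client API", 7000), ("OPA", 8181)):
    state = "open" if port_is_open("localhost", port) else "closed"
    print(f"{name} (port {port}): {state}")
```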
Same instructions as for OPAL server.
Now that OPAL is live, we can use OPAL server to push updates to OPAL clients in real time.