When building a multi-tenant SaaS system that requires WebSocket connections, we need a way to rate limit those connections on a per-tenant basis. Amazon API Gateway offers usage plans for HTTP connections, but they are not available for WebSockets. To enable rate limiting, we can use an API Gateway Lambda authorizer to validate a connection and control access. With a Lambda authorizer, we can implement code that validates connection rates and throttles inbound connections on a per-tenant basis. This sample also demonstrates pool and silo modes for handling message traffic per tenant.
- The client sends an HTTP PUT request to the Amazon API Gateway HTTP endpoint to create a session for a tenant. If required, this call can be authenticated; however, that is outside the scope of this sample.
- A Lambda function will create a session and store it in a DynamoDB table with a TTL (Time To Live) value specified. Amazon DynamoDB Streams are used to remove all session connections if no communication is sent or received over a specific period of time. Each call to the database layer is restricted by conditional keys using sts:TransitiveTagKeys so each tenant can only access its specific rows based on the tenant ID.
- Once a session is created, the client initiates a WebSocket connection to the Amazon API Gateway WebSocket endpoint. A session can be used multiple times to create connections from multiple web browser windows. The session is used to keep these different connections in sync.
- A Lambda function is used as the Authorizer for the WebSocket connection. The authorizer will do the following:
- Validate the tenant exists.
- Validate the session exists.
- Add the tenant ID, session ID, connection ID, and tenant settings to the authorization context.
- A Lambda function is used for the connect route which throttles inbound connections and returns a 429 response code if over a limit. The following checks and processing are done:
- Over the total number of connections allowed for this tenant.
- Over the total number of connections allowed for this session.
- Over the total number of connections per minute allowed for the tenant.
- Over the total number of connections per minute allowed for the session.
- Add the connection ID to the session's connection ID set and update the session Time to Live (TTL).
- Increment the total number of connections for the tenant.
- Messages are processed via a siloed or pooled SQS FIFO queue, depending on the API Gateway route. SQS FIFO queues are used to keep messages in order. If messages were sent directly to the Lambda function, the first message could hit a cold start and be delayed while a following message hits a warm Lambda function and is processed faster, returning an out-of-order reply. The tenant ID, session ID, connection ID, and tenant settings are added to each message as message metadata. The SQS message group ID is a combination of the tenant ID and session ID, which keeps messages in order. Each inbound message updates the DynamoDB session TTL to reset the session timeout.
- Silo based messages are processed by the tenant’s corresponding SQS FIFO queue, which is named using the tenant ID. A Lambda function per tenant is used to read messages from the tenant’s SQS FIFO queue.
- Pool based messages are processed by a single pooled SQS FIFO queue. A Lambda function is used by all tenants to read messages from the pooled SQS FIFO queue.
- A Lambda function is used during disconnect to do the following:
- Remove the connection ID from the session connection ID set.
- Decrement the total number of connections for the tenant.
- Once all connections are closed, the client will send an HTTP DELETE request to the Amazon API Gateway HTTP endpoint to remove the session.
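The four connect-route checks listed above can be sketched as a pure function. This is a minimal illustration, not the sample's actual code: the `TenantLimits` and `ConnectionCounts` classes and the `check_connect` helper are hypothetical, and the sketch assumes a limit counts as reached when the current count already equals the maximum. In the deployed system the counts would come from DynamoDB rather than from in-memory values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantLimits:
    # Field names mirror the tenant table; the class itself is hypothetical.
    tenantConnections: int      # max open connections for the tenant
    connectionsPerSession: int  # max open connections for one session
    tenantPerMinute: int        # max new connections per minute for the tenant
    sessionPerMinute: int       # max new connections per minute for one session


@dataclass(frozen=True)
class ConnectionCounts:
    # Current counts, assumed to have been read from the data store already.
    tenant_total: int
    session_total: int
    tenant_this_minute: int
    session_this_minute: int


def check_connect(limits: TenantLimits, counts: ConnectionCounts) -> int:
    """Return 429 if one more connection would exceed any limit, else 200."""
    checks = [
        (counts.tenant_total, limits.tenantConnections),
        (counts.session_total, limits.connectionsPerSession),
        (counts.tenant_this_minute, limits.tenantPerMinute),
        (counts.session_this_minute, limits.sessionPerMinute),
    ]
    for current, maximum in checks:
        if current >= maximum:  # assumption: reaching the maximum means "over"
            return 429
    return 200


# Illustrative values only.
limits = TenantLimits(tenantConnections=10, connectionsPerSession=2,
                      tenantPerMinute=5, sessionPerMinute=2)
accepted = check_connect(limits, ConnectionCounts(3, 1, 1, 1))
throttled = check_connect(limits, ConnectionCounts(3, 2, 1, 1))
```

A real implementation would likely need the check and the counter increment to be atomic, for example through a conditional update, so that concurrent connects cannot both pass the same check.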
- Apache Maven 3.8.1
- AWS CDK 1.130.0 or later installed
- git clone this repository
- In the root directory of the repository execute the command
cdk deploy
- Review the permissions and follow prompts
- After deployment the CDK will list the outputs as follows:
- APIGatewayWebSocketRateLimitStack.SampleClient
- The URI points to the sample web page described below
- APIGatewayWebSocketRateLimitStack.SessionURL
- This URI points to the endpoint which is able to create sessions
- APIGatewayWebSocketRateLimitStack.TenantURL
- This URI is only exposed for demo purposes and is used to get a list of the current tenant Ids
- APIGatewayWebSocketRateLimitStack.WebSocketURL
- This URI is the websocket connection endpoint
The sample can be used to test the various aspects of the system. The following steps are the happy path:
- Open the web page given as the output APIGatewayWebSocketRateLimitStack.SampleClient from the CDK deployment
- Wait for the tenant Ids to load
- Click the Create Session button to create a new session
- Click the Connect button
- Once connected try both the Send Silo and Send Pooled buttons
- The Send Silo button will send via the SiloSQS route which uses the siloed SQS FIFO queue execution model
- The Send Pooled button will send via the PooledSQS route which uses the pooled SQS FIFO queue execution model
- Click the Disconnect button to close the connection
- Click the Delete Session button to remove the current session
- In the root directory of the repository execute the command
cdk destroy
SQS FIFO queues and siloed Lambda functions per tenant are used in silo mode. API Gateway uses the authorization context's tenantId to determine the queue name for each tenant. Each SQS FIFO queue has a linked Lambda function that processes messages and sends an echo reply.
A single SQS FIFO queue and a single Lambda function shared by all tenants are used in pool mode. The pooled SQS FIFO queue has a linked Lambda function that processes messages and sends an echo reply.
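As a rough sketch of how a message might be routed in the two modes, the helpers below pick a queue name and build the message group ID. The function names, the `pooled.fifo` queue name, the `{tenantId}.fifo` naming pattern, and the `:` separator are all assumptions made for illustration; the only fixed rule relied on here is that SQS FIFO queue names end in `.fifo`.

```python
def queue_name(mode: str, tenant_id: str) -> str:
    """Pick the FIFO queue for a message based on the execution model."""
    if mode == "silo":
        return f"{tenant_id}.fifo"  # hypothetical per-tenant naming pattern
    if mode == "pool":
        return "pooled.fifo"        # hypothetical shared queue name
    raise ValueError(f"unknown mode: {mode}")


def message_group_id(tenant_id: str, session_id: str) -> str:
    """Combine tenant and session so ordering is preserved per session."""
    return f"{tenant_id}:{session_id}"  # the separator is an assumption


silo_queue = queue_name("silo", "tenant-a")
pool_queue = queue_name("pool", "tenant-a")
group_id = message_group_id("tenant-a", "session-1")
```

Because the group ID includes the session ID, messages from different sessions of the same tenant can be processed in parallel while each session's messages stay in order.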
All table access is restricted by a partition key condition that only allows access to rows whose partition key matches the current tenantId.
The tenant table stores the tenantIds along with per-tenant options that let each tenant specify different rate limits.
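One common way to express this kind of restriction is an IAM policy that compares the `dynamodb:LeadingKeys` condition key against a principal tag. The sketch below builds such a policy document as a Python dictionary. It is not taken from the sample: the `tenant_scoped_policy` helper, the `tenantId` tag name, the action list, and the table ARN are assumptions for illustration.

```python
import json


def tenant_scoped_policy(table_arn: str, tag_key: str = "tenantId") -> dict:
    """Build a policy document limiting DynamoDB access to the caller's tenant rows."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                    "dynamodb:DeleteItem",
                    "dynamodb:Query",
                ],
                "Resource": table_arn,
                "Condition": {
                    # Partition key values must equal the tenant tag on the session.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [f"${{aws:PrincipalTag/{tag_key}}}"]
                    }
                },
            }
        ],
    }


# Placeholder ARN, for illustration only.
policy = tenant_scoped_policy("arn:aws:dynamodb:us-east-1:123456789012:table/Session")
policy_json = json.dumps(policy, indent=2)
```

The tag value would be supplied when the role is assumed, which is where a transitive session tag such as the one mentioned earlier comes in.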
Fields
- tenantId (String) (Partition Key) - The tenantId
- connectionsPerSession (Number) - The max number of connections each session is allowed
- tenantConnections (Number) - The max number of connections this tenant is allowed
- sessionPerMinute (Number) - The max number of connections per minute for a session
- tenantPerMinute (Number) - The max number of connections per minute for this tenant
- sessionTTL (Number) - The session time to live value in seconds. Each time activity occurs on a session, the session's TTL is set to the current time plus this value, extending the period before the session times out and its connections are dropped.
- messagesPerMinute (Number) - The total number of messages per minute this tenant is allowed to process before throttling the tenant.
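To make the sessionTTL arithmetic concrete, here is a small sketch of a tenant item and the expiry calculation (current time plus sessionTTL, in epoch seconds, which is the format DynamoDB TTL attributes use). The `session_expiry` helper and all values in `tenant_item` are illustrative assumptions, not values from the sample.

```python
import time
from typing import Optional


def session_expiry(session_ttl_seconds: int, now: Optional[int] = None) -> int:
    """Return the epoch-second expiry: current time plus the tenant's sessionTTL."""
    current = int(time.time()) if now is None else now
    return current + session_ttl_seconds


# Illustrative tenant row; attribute names follow the field list above.
tenant_item = {
    "tenantId": "tenant-a",
    "connectionsPerSession": 2,
    "tenantConnections": 10,
    "sessionPerMinute": 2,
    "tenantPerMinute": 5,
    "sessionTTL": 300,
    "messagesPerMinute": 100,
}

# A fixed "now" keeps the example deterministic.
expiry = session_expiry(tenant_item["sessionTTL"], now=1_700_000_000)
```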
The limit table is used to store the current limit counts for each tenant and also the per minute counts.
Fields
- key (String) (Partition Key) - This key field can be one of three formats
- tenantId - If the key is a single tenantId then it is tracking the total number of connections for this tenant
- tenantId:minute:{epoch} - Tracks the number of connections made by the tenant during the 60-second window that starts at {epoch}.
- tenantId:sessionId:minute:{epoch} - Tracks the number of connections made by the session during the 60-second window that starts at {epoch}.
- itemCount (Number) - The current value for the limit
- itemTTL (Number) (TTL) - The time to live value for DynamoDB to remove this item. This is used for the per minute connection rates to remove expired rows.
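The three key formats can be illustrated with a few helper functions. This sketch assumes that `{epoch}` is the current time in seconds rounded down to the start of the minute; the source only says the window runs for 60 seconds from `{epoch}`, so the alignment and the helper names are assumptions.

```python
def minute_window_start(epoch_seconds: int) -> int:
    """Round an epoch-second timestamp down to the start of its minute."""
    return epoch_seconds - (epoch_seconds % 60)


def tenant_total_key(tenant_id: str) -> str:
    """Key for the tenant's total open connections."""
    return tenant_id


def tenant_minute_key(tenant_id: str, epoch_seconds: int) -> str:
    """Key for the tenant's connections in the current one-minute window."""
    return f"{tenant_id}:minute:{minute_window_start(epoch_seconds)}"


def session_minute_key(tenant_id: str, session_id: str, epoch_seconds: int) -> str:
    """Key for the session's connections in the current one-minute window."""
    return f"{tenant_id}:{session_id}:minute:{minute_window_start(epoch_seconds)}"


now = 1_700_000_025  # fixed timestamp so the example is deterministic
keys = [
    tenant_total_key("tenant-a"),
    tenant_minute_key("tenant-a", now),
    session_minute_key("tenant-a", "session-1", now),
]
```

With aligned windows, every connection attempt in the same minute maps to the same row, and the row's itemTTL can then let DynamoDB clean it up after the window has passed.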
The session table keeps track of sessions per tenant and expires sessions after a set amount of time.
Fields
- tenantId (String) (Partition Key) - The tenantId
- sessionId (String) (Sort Key) - The sessionId
- connectionIds (Set [String]) - The current connectionIds for this session. This is used to keep track of the number of connections per session. It is also used to send reply messages to all connections on a specific session.
- sessionTTL (Number) (TTL) - The time to live value for DynamoDB to remove this item. This value is used to remove expired sessions and disconnect any lingering connections associated with them.
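As a sketch of how a connection might be recorded against a session, the helper below builds the arguments for a DynamoDB UpdateItem call in the low-level attribute-value format. It adds the connection ID to the `connectionIds` string set and refreshes `sessionTTL` in one request. The `add_connection_params` helper, the table name, and the choice of condition expression are assumptions; the sample's own code may do this differently.

```python
def add_connection_params(table_name: str, tenant_id: str, session_id: str,
                          connection_id: str, new_ttl: int) -> dict:
    """Build UpdateItem arguments that add a connection and refresh the TTL."""
    return {
        "TableName": table_name,
        "Key": {
            "tenantId": {"S": tenant_id},    # partition key
            "sessionId": {"S": session_id},  # sort key
        },
        # ADD inserts into the string set; SET overwrites the TTL attribute.
        "UpdateExpression": "ADD connectionIds :conn SET sessionTTL = :ttl",
        # Only update sessions that already exist (assumed behavior).
        "ConditionExpression": "attribute_exists(sessionId)",
        "ExpressionAttributeValues": {
            ":conn": {"SS": [connection_id]},
            ":ttl": {"N": str(new_ttl)},
        },
    }


# Hypothetical table name and IDs.
params = add_connection_params("Session", "tenant-a", "session-1",
                               "conn-1", 1_700_000_300)
```

These arguments could be passed to a DynamoDB client, for example `client.update_item(**params)` with boto3. The disconnect route would use a matching `DELETE connectionIds :conn` clause to remove the ID from the set.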
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.