Create a load testing command #135

StoneDot · 2023-06-18T14:30:49Z

Load testing command

Background

The load testing command is useful in understanding DynamoDB behaviors, for example, throttling, auto-scaling, metrics, etc. Also, it helps users to investigate an application's behavior when throttling happens.

Proposed design

The key implementation decisions are as follows:

The request rate is controlled by a leaky bucket algorithm with a feedback loop that adjusts the next amount to acquire based on the capacity actually consumed.
The current consumed capacity is updated and displayed in real time. However, the first implementation will omit visualizations such as graphs.
To prevent consuming capacity unintentionally, the user must provide the target RCU and WCU.
The internal request manager controls the maximum number of parallel requests to DynamoDB. It is responsible for scaling the number of parallel requests in or out, and it scales exponentially with base 2.
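
To make the first and last decisions more concrete, here is a minimal sketch in Python (dynein itself is written in Rust, so this is illustration only, not the planned implementation). The names LeakyBucket, settle, and next_parallelism, the 10% tolerance band, and the rule of doubling when throughput is below target and halving when above are all assumptions; the proposal only states that a leaky bucket with feedback is used and that parallelism scales with base 2.

```python
class LeakyBucket:
    """Leaky bucket used as a meter: `level` drains at `rate` units per second."""

    def __init__(self, rate, capacity, initial_cost=1.0):
        self.rate = float(rate)          # target RCU or WCU (capacity units per second)
        self.capacity = float(capacity)  # how much burst the bucket tolerates
        self.level = 0.0                 # capacity units currently held in the bucket
        self.last = 0.0                  # timestamp of the last leak calculation
        self.next_cost = float(initial_cost)  # amount to reserve for the next request

    def try_acquire(self, now):
        """Reserve `next_cost` units at time `now`; return the amount or None."""
        elapsed = max(0.0, now - self.last)
        self.level = max(0.0, self.level - elapsed * self.rate)
        self.last = now
        if self.level + self.next_cost > self.capacity:
            return None  # the request would overflow the bucket, so wait
        self.level += self.next_cost
        return self.next_cost

    def settle(self, reserved, consumed):
        """Feedback step: replace the reservation with the actual consumed capacity."""
        self.level = max(0.0, self.level + consumed - reserved)
        self.next_cost = float(consumed)  # the next acquisition follows the real cost


def next_parallelism(current, achieved, target, limit=1024, tolerance=0.1):
    """Double or halve the number of parallel requests (base-2 scaling)."""
    if achieved < target * (1 - tolerance):
        return min(current * 2, limit)   # scale out
    if achieved > target * (1 + tolerance):
        return max(current // 2, 1)      # scale in
    return current
```

In this sketch, a worker would call try_acquire before each request, send the request only when it returns a number, and then pass the ConsumedCapacity value reported by DynamoDB to settle.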

Interface

In the first implementation, load testing is provided by the dy bench run (or dy benchmark run) command, with the following options:

--rcu <number>: Specify target RCU when reading items. This is a required argument.
--wcu <number>: Specify target WCU when writing items. This is a required argument unless you provide --skip-item-creation.
--size <number>: The preferred size of an attribute in bytes. The default value is 500.
--skip-item-creation: By default, dynein first creates items for the write test and then runs the read test using the created items. This option skips the write (WCU) test and uses the data already stored in the table.
--partition-key-variations <number>: The maximum number of partition key variations of items. The default value is 1000.
--sort-key-variations <number>: The maximum number of sort key variations of items. The default value is 100.
--duration-write <number>: The duration of the write testing. The default value is five minutes.
--duration-read <number>: The duration of the read testing. The default value is five minutes.

Common options such as --table and --region are supported, as in other commands.
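
The option rules above can be sketched as a small parser. This is a Python argparse illustration, not dynein's actual Rust CLI code. The duration unit is not stated in the proposal, so seconds are assumed here (300 for the five-minute default), and parse_args is a hypothetical helper name.

```python
import argparse


def build_parser():
    # Option names and defaults come from the list above; durations are
    # ASSUMED to be in seconds because the proposal does not state a unit.
    parser = argparse.ArgumentParser(prog="dy bench run")
    parser.add_argument("--rcu", type=int, required=True)
    parser.add_argument("--wcu", type=int)
    parser.add_argument("--size", type=int, default=500)
    parser.add_argument("--skip-item-creation", action="store_true")
    parser.add_argument("--partition-key-variations", type=int, default=1000)
    parser.add_argument("--sort-key-variations", type=int, default=100)
    parser.add_argument("--duration-write", type=int, default=300)
    parser.add_argument("--duration-read", type=int, default=300)
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # --wcu is mandatory only when the write phase actually runs.
    if args.wcu is None and not args.skip_item_creation:
        parser.error("--wcu is required unless --skip-item-creation is given")
    return args
```

For example, parse_args(["--rcu", "100", "--skip-item-creation"]) is accepted without --wcu, while parse_args(["--rcu", "100"]) exits with a usage error.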

We use the bench run subcommand for the initial implementation. Please note that this leaves room for future enhancements. For example, we could use dy bench run -s <scenario-file> for scenario-based tests and dy bench report <report-file> to show the result of a test.

The workflow

The load testing workflow is, schematically, as follows:

Based on the --partition-key-variations and --sort-key-variations arguments, create a list of primary keys to use in the test. If --skip-item-creation is provided, the Scan API is invoked to list the primary keys instead. We must use parallel scans because sequential scans create a hot partition.
Based on the --wcu argument, PutItem requests are issued with the primary keys from the first step for the duration given by --duration-write. Each created item has an additional string attribute of --size bytes.
Based on the --rcu argument, GetItem requests are issued with the primary keys from the first step for the duration given by --duration-read.
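
The first two steps can be sketched with pure functions, again in Python for illustration only. The attribute names pk, sk, and payload and the key format are hypothetical, since the proposal does not define them. Segment and TotalSegments are the real DynamoDB Scan parameters used for parallel scans; the sketch only builds the request parameters and does not call AWS.

```python
import itertools


def generate_keys(partition_key_variations, sort_key_variations):
    """Enumerate every partition key / sort key combination used by the test."""
    return [
        {"pk": f"pk-{p}", "sk": f"sk-{s}"}
        for p, s in itertools.product(
            range(partition_key_variations), range(sort_key_variations)
        )
    ]


def build_item(key, size):
    """Attach a string attribute that is `size` bytes long (ASCII filler)."""
    return {**key, "payload": "x" * size}


def scan_segment_requests(table_name, total_segments):
    """Build one Scan request per segment so that workers can scan in parallel."""
    return [
        {
            "TableName": table_name,
            "Segment": segment,
            "TotalSegments": total_segments,
            "ProjectionExpression": "pk, sk",  # only the keys are needed
        }
        for segment in range(total_segments)
    ]
```

With the default values of 1000 and 100, generate_keys produces up to 100,000 distinct primary keys, which the write and read phases can then cycle through.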

ryota-sakamoto · 2023-06-20T14:05:40Z

Thank you for proposing this great feature.

I think other commands generally follow the format dy <verb> or dy <command> <verb>. What other subcommands do you have in mind besides simple?

StoneDot · 2023-06-20T15:28:39Z

> I think other command have followed the format like dy <verb> or dy <command> <verb> in general. What kind of other sub command do you have rather than simple?

I have some ideas regarding scenario-based benchmarking. I suppose it would be invoked with a dy benchmark scenario command. Its command style is the same as dy admin create table. I understand that it is a little awkward as an English phrase, but I feel dy benchmark table simply is a little verbose. I am open to good suggestions for the command name.

StoneDot · 2023-06-20T16:14:42Z

I would also like to mention the YCSB command style as an option. If we adopted that style in dynein, it would be dy benchmark load to load the data and dy benchmark run to run the workload. The advantage is compatibility with YCSB, and the disadvantage is that loading and testing would have to be run separately. But I prefer dy benchmark simple.

ryota-sakamoto · 2023-06-23T17:01:22Z

I think we need to provide a command that shows the result of a load test.
I'm not sure yet how to run a scenario-based test, but I have two ideas for providing both a simple test and a scenario-based test.

all in one

The idea is that we can run both the simple test and the scenario-based test with one command. If a scenario file is specified to run a scenario-based test, I can imagine commands like the following. It is just a simple interface.

# simple test
$ dy load run --rcu 100 --wcu 5

# scenario base test
$ dy load run -s <scenario-file>

# show result of load test
$ dy load report <report-file>

split command

The idea is that we provide two commands, load and benchmark. The role of each command is clear.

# simple test
$ dy load run --rcu 100 --wcu 5
$ dy load report <report-file>

# scenario base test
$ dy benchmark run <scenario-file>
$ dy benchmark report <report-file>

StoneDot · 2023-06-26T14:07:36Z

I personally find the -s option to be a clear and effective way of specifying scenario-based testing. Also, it makes sense to split the run and report commands. Thank you for your suggestion. However, I'm a bit concerned that the load argument might confuse users since it has multiple meanings. In other words, I worry that users might mix up loading the data and loading DynamoDB for stress testing.

In my opinion, using the term benchmark (maybe even a shorter version like bench) would be clearer than load. What do you think?

Additionally, I would like to propose the following commands:

# Perform a simple test
$ dy bench run --rcu 100 --wcu 5

# Conduct a scenario-based test (not implemented in the initial phase)
$ dy bench run -s <scenario-file>

# Generate a report for the load test (not implemented in the initial phase)
$ dy bench report <report-file>

Please let me know what you think about these suggestions and proposed commands.

ryota-sakamoto · 2023-06-27T08:08:07Z

I agree with you. Using benchmark or bench instead of load is clear and easy to understand.

StoneDot · 2023-09-29T14:24:06Z

Based on an internal discussion with a Solutions Architect, the following features are preferable:

Specify the maximum number of concurrent requests instead of RCU and WCU.
Specify the primary keys to use for load testing.

He wants functionality similar to what the following project provides:
https://github.com/aws-samples/dynamodb-consumed-capacity-check-tool
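
As a rough illustration of the first requested feature, a concurrency cap could look like the following Python sketch. The function run_with_max_concurrency and the send_request callback are hypothetical names, and this is not taken from the linked project; it only shows the idea of bounding in-flight requests instead of targeting RCU and WCU.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_max_concurrency(keys, send_request, max_concurrency):
    """Call send_request(key) for every key with a bounded number in flight.

    The pool size is the cap: at most `max_concurrency` requests run at the
    same time, and results are returned in the same order as `keys`.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(send_request, keys))
```

Here the user-supplied list of primary keys is passed in directly, which also covers the second requested feature.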

StoneDot self-assigned this Jun 18, 2023

StoneDot added the enhancement New feature or request label Aug 24, 2023

StoneDot added this to the v0.4.0 milestone Sep 30, 2023
