BenchWeb/GitHub #6
base: BenchWeb/frameworks
Conversation
Reviewer's Guide by Sourcery
This PR adds the core infrastructure and benchmarking framework for BenchWeb, including CI pipeline configuration, Docker support, database integrations, and test verification capabilities.
Class diagram for Results and DockerHelper classes
classDiagram
class Results {
-benchmarker
-config
-directory
-file
-uuid
-name
-environmentDescription
-git
-startTime
-completionTime
-concurrencyLevels
-pipelineConcurrencyLevels
-queryIntervals
-cachedQueryIntervals
-frameworks
-duration
-rawData
-completed
-succeeded
-failed
-verify
+parse(tests)
+parse_test(framework_test, test_type)
+parse_all(framework_test)
+write_intermediate(test_name, status_message)
+set_completion_time()
+upload()
+load()
+get_docker_stats_file(test_name, test_type)
+get_raw_file(test_name, test_type)
+get_stats_file(test_name, test_type)
+report_verify_results(framework_test, test_type, result)
+report_benchmark_results(framework_test, test_type, results)
+finish()
}
class DockerHelper {
-benchmarker
-client
-server
-database
+clean()
+build(test, build_log_dir)
+run(test, run_log_dir)
+stop(containers)
+build_databases()
+start_database(database)
+build_wrk()
+test_client_connection(url)
+server_container_exists(container_id_or_name)
+benchmark(script, variables)
}
PR Reviewer Guide 🔍
Here are some key observations to aid the review process:
Hey @gitworkflows - I've reviewed your changes - here's some feedback:
Overall Comments:
- Some configuration files appear to be empty placeholders (.siegerc, my.cnf, etc). Please add reasonable default configurations to these files.
Here's what I looked at during the review
- 🟡 General issues: 1 issue found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 2 issues found
- 🟡 Documentation: 1 issue found
prefix=log_prefix,
file=benchmark_log)

max_time = time.time() + 60
suggestion: Consider making the container startup timeout configurable
A fixed 60 second timeout may not be suitable for all environments. Consider making this configurable or adaptive based on system resources.
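One way to act on this suggestion is to read the timeout from the environment with a fallback to the current hard-coded value. This is only a sketch; the `BW_STARTUP_TIMEOUT` variable name is an assumption, not part of the PR.

```python
import os
import time

DEFAULT_STARTUP_TIMEOUT = 60  # seconds, matching the current hard-coded value


def get_startup_timeout():
    """Return the container startup timeout in seconds.

    Overridable via the (assumed) BW_STARTUP_TIMEOUT environment
    variable; falls back to the default on a missing or invalid value.
    """
    try:
        return int(os.environ.get("BW_STARTUP_TIMEOUT", DEFAULT_STARTUP_TIMEOUT))
    except (TypeError, ValueError):
        return DEFAULT_STARTUP_TIMEOUT


# Replaces the fixed `max_time = time.time() + 60`
max_time = time.time() + get_startup_timeout()
```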
## Prerequisites

* **A recent version of Vagrant**, like 1.6.3 (NOTE: `apt-get` is
suggestion (documentation): Fix inconsistent project name usage (BenchWeb vs BW)
The document switches between using 'BenchWeb' and 'BW'. Consider standardizing on one name or explicitly stating that BW is an abbreviation for BenchWeb.
* **A recent version of Vagrant**, like 1.6.3 (NOTE: `apt-get` is
too old, download the newest `deb` directly). See
[here](https://www.vagrantup.com/downloads.html) for downloads
* **A CPU that can virtualize a 64-bit virtual OS**, because BenchWeb
downloads a number of static binaries that require 64-bit. See
the FAQs section below for more on this. If you cannot meet this
requirement, consider using the Amazon provider (about `$1/day`)
with open(self.file, "w") as f:
    f.write(json.dumps(self.__to_jsonable(), indent=2))

def parse_test(self, framework_test, test_type):
issue (complexity): Consider refactoring the parse_test() method into smaller focused methods to handle different parsing responsibilities.
The parse_test() method has deep nesting that makes it hard to follow. Consider extracting the line parsing logic into separate methods:
def parse_test(self, framework_test, test_type):
    results = {'results': []}
    stats = []
    if not os.path.exists(self.get_raw_file(framework_test.name, test_type)):
        return results
    with open(self.get_raw_file(framework_test.name, test_type)) as raw_data:
        is_warmup = True
        current_result = None
        for line in raw_data:
            if self._is_new_request_block(line):
                is_warmup = False
                current_result = None
                continue
            if self._is_warmup_line(line):
                is_warmup = True
                continue
            if not is_warmup:
                current_result = self._ensure_result_dict(current_result, results)
                self._parse_metric_line(line, current_result)
                if self._is_end_time_line(line):
                    stats.append(self._generate_stats(
                        framework_test, test_type,
                        current_result["startTime"], current_result["endTime"]))
    self._write_stats_file(framework_test, test_type, stats)
    return results

def _parse_metric_line(self, line, result):
    if "Latency" in line:
        self._parse_latency(line, result)
    elif "requests in" in line:
        self._parse_requests(line, result)
    elif "Socket errors" in line:
        self._parse_socket_errors(line, result)
    # etc. for other metrics
This refactoring:
- Reduces nesting depth
- Makes the main flow clearer
- Isolates parsing logic into focused methods
- Makes it easier to modify individual metric parsing
The functionality remains identical but the code becomes more maintainable.
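The suggestion above leaves the individual metric parsers unimplemented. As a standalone sketch, one such helper might look like the following; the wrk summary-line format shown in the comment is an assumption about the raw file contents, and the function is written at module level (without `self`) purely for illustration.

```python
def _parse_latency(line, result):
    """Parse a wrk latency summary line into the result dict.

    Assumed input format (columns: avg, stdev, max, +/- stdev):
        "Latency   634.86us    0.89ms   12.92ms   75.38%"
    """
    fields = line.split()
    if len(fields) >= 5 and fields[0] == "Latency":
        result["latencyAvg"] = fields[1]
        result["latencyStdev"] = fields[2]
        result["latencyMax"] = fields[3]


result = {}
_parse_latency("Latency   634.86us    0.89ms   12.92ms   75.38%", result)
```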
run_tests = test_dirs
quit_diffing()

# Forced *fw-only* specific tests
issue (complexity): Consider refactoring the command handling logic into a registry pattern with dedicated handler functions.
The command handling logic can be simplified by using a command registry pattern. This would reduce code duplication while maintaining clarity. Here's how:
# Define command handlers
def handle_fw_only(args, test_dirs):
    tests = args.strip().split(' ')
    return [test for test in tests if test in test_dirs]

def handle_lang_only(args, test_dirs):
    langs = args.strip().split(' ')
    return [test for test in test_dirs if any(test.startswith(lang + "/") for lang in langs)]

def handle_fw(args, test_dirs):
    tests = args.strip().split(' ')
    return [test for test in tests if test in test_dirs]

def handle_lang(args, test_dirs):
    langs = args.strip().split(' ')
    return [test for test in test_dirs if any(test.startswith(lang + "/") for lang in langs)]

# Command registry
COMMANDS = {
    r'\[ci fw-only (.+)\]': (handle_fw_only, True),  # (handler, is_only_command)
    r'\[ci lang-only (.+)\]': (handle_lang_only, True),
    r'\[ci fw (.+)\]': (handle_fw, False),
    r'\[ci lang (.+)\]': (handle_lang, False)
}

# Process commands
run_tests = []
for pattern, (handler, is_only) in COMMANDS.items():
    if match := re.search(pattern, last_commit_msg, re.M):
        tests = handler(match.group(1), test_dirs)
        print(f"Tests {tests} will run based on command.")
        run_tests.extend(tests)
        if is_only:
            quit_diffing()
This approach:
- Eliminates duplicate regex logic
- Makes it easier to add new commands
- Centralizes command handling logic
- Maintains explicit handling of each command type
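For illustration, here is how one registry handler would behave in isolation; the `test_dirs` list and commit message below are made-up sample data, not values from the PR.

```python
import re

# One handler from the suggested registry, shown standalone.
def handle_lang_only(args, test_dirs):
    langs = args.strip().split(' ')
    return [test for test in test_dirs
            if any(test.startswith(lang + "/") for lang in langs)]


# Hypothetical sample data to exercise the handler.
test_dirs = ["python/django", "python/flask", "go/gin"]
commit_msg = "tweak configs [ci lang-only python]"

match = re.search(r'\[ci lang-only (.+)\]', commit_msg, re.M)
selected = handle_lang_only(match.group(1), test_dirs) if match else []
```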
db = db.getSiblingDB('hello_world')
db.world.drop()
for (var i = 1; i <= 10000; i++) {
issue (code-quality): Use `const` or `let` instead of `var`. (avoid-using-var)
Explanation
`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code). `let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than function-scoped. From the Airbnb JavaScript Style Guide.
self.failed = dict()
self.verify = dict()
for type in test_types:
    self.rawData[type] = dict()
suggestion (code-quality): Replace dict() with {} (dict-literal)
- self.rawData[type] = dict()
+ self.rawData[type] = {}
Explanation
The most concise and Pythonic way to create a dictionary is to use the {} notation.
This fits in with the way we create dictionaries with items, saving a bit of mental energy that might be taken up with thinking about two different ways of creating dicts.
x = {"first": "thing"}
Doing things this way has the added advantage of being a nice little performance improvement.
Here are the timings before and after the change:
$ python3 -m timeit "x = dict()"
5000000 loops, best of 5: 69.8 nsec per loop
$ python3 -m timeit "x = {}"
20000000 loops, best of 5: 29.4 nsec per loop
Similar reasoning and performance results hold for replacing list() with [].
'''
Parses the given test and test_type from the raw_file.
'''
results = dict()
suggestion (code-quality): Replace dict() with {} (dict-literal)
- results = dict()
+ results = {}
continue
if not is_warmup:
    if rawData is None:
        rawData = dict()
suggestion (code-quality): Replace dict() with {} (dict-literal)
- rawData = dict()
+ rawData = {}
the parent process' memory from the child process
'''
if framework_test.name not in self.verify.keys():
    self.verify[framework_test.name] = dict()
suggestion (code-quality): Replace dict() with {} (dict-literal)
- self.verify[framework_test.name] = dict()
+ self.verify[framework_test.name] = {}
the parent process' memory from the child process
'''
if test_type not in self.rawData.keys():
    self.rawData[test_type] = dict()
suggestion (code-quality): Replace dict() with {} (dict-literal)
- self.rawData[test_type] = dict()
+ self.rawData[test_type] = {}
PR Code Suggestions ✨
Explore these optional code suggestions:
PR Type
enhancement, configuration changes, documentation
Description
Implements the Results, Metadata, DockerHelper, Scaffolding, Benchmarker, FortuneHTMLParser, TimeLogger, PopenTimeout, FrameworkTest, AbstractTestType, and TestType classes, along with various scripts for test execution and management. Load testing uses the wrk tool.
Changes walkthrough 📝
16 files
- wrk.dockerfile (benchmarks/load-testing/wrk/wrk.dockerfile): Add Dockerfile for benchmarking with the wrk tool.
- postgres.dockerfile (infrastructure/docker/databases/postgres/postgres.dockerfile): Add Dockerfile for PostgreSQL 17 database setup.
- mysql.dockerfile (infrastructure/docker/databases/mysql/mysql.dockerfile): Add Dockerfile for MySQL 9.0 database setup.
- mongodb.dockerfile (infrastructure/docker/databases/mongodb/mongodb.dockerfile): Add Dockerfile for MongoDB 7.0 database setup.
- custom_motd.sh (infrastructure/vagrant/custom_motd.sh): Add custom MOTD script for Vagrant setup.
- .siegerc (infrastructure/docker/databases/.siegerc): Add configuration file for Siege load testing tool.
- ci-pipeline.yml (.github/workflows/ci-pipeline.yml): Add GitHub Actions workflow for CI pipeline.
- 60-postgresql-shm.conf (infrastructure/docker/databases/postgres/60-postgresql-shm.conf): Add shared memory configuration for PostgreSQL; sets the shmmax and shmall kernel parameters.
- 60-database-shm.conf (infrastructure/docker/databases/mysql/60-database-shm.conf): Add shared memory configuration for MySQL; sets the shmmax and shmall kernel parameters.
- create.sql (infrastructure/docker/databases/mysql/create.sql): Add MySQL database setup script for benchmarking.
- Dockerfile (infrastructure/docker/Dockerfile): Create Dockerfile for Ubuntu-based benchmarking environment.
- my.cnf (infrastructure/docker/databases/mysql/my.cnf): Add MySQL configuration for Docker environment.
- postgresql.conf (infrastructure/docker/databases/postgres/postgresql.conf): Add PostgreSQL configuration for Docker environment.
- bw.service (infrastructure/docker/services/bw.service): Add systemd service file for BenchWeb.
- Vagrantfile (infrastructure/vagrant/Vagrantfile): Add Vagrantfile for virtual machine configuration.
- benchmark_config.json (benchmarks/pre-benchmarks/benchmark_config.json): Add template configuration for benchmark tests.
46 files
- results.py (utils/results.py): Implement the Results class for handling benchmark results.
- verifications.py (benchmarks/verifications.py): Add verification functions for benchmark test results.
- metadata.py (utils/metadata.py): Implement the Metadata class for managing test metadata.
- docker_helper.py (utils/docker_helper.py): Implement the DockerHelper class for managing Docker operations.
- scaffolding.py (utils/scaffolding.py): Implement the Scaffolding class for setting up new benchmark tests.
- benchmarker.py (benchmarks/benchmarker.py): Implement the Benchmarker class for running benchmark tests.
- fortune_html_parser.py (benchmarks/fortune/fortune_html_parser.py): Implement the FortuneHTMLParser class for parsing HTML responses.
- run-tests.py (scripts/run-tests.py): Add script for running and configuring benchmark tests.
- time_logger.py (utils/time_logger.py): Implement the TimeLogger class for tracking execution times.
- popen.py (utils/popen.py): Implement the PopenTimeout class for subprocess management with timeout.
- github_actions_diff.py (.github/github_actions/github_actions_diff.py): Add script for selective test execution in GitHub Actions based on the PR.
- framework_test.py (benchmarks/framework_test.py): Implement the FrameworkTest class for managing framework tests.
- abstract_test_type.py (benchmarks/abstract_test_type.py): Define the AbstractTestType class as a base for test types.
- fortune.py (benchmarks/fortune/fortune.py): Add TestType class for fortune tests; uses FortuneHTMLParser for response parsing.
- benchmark_config.py (utils/benchmark_config.py): Create the BenchmarkConfig class for configuration management.
- abstract_database.py (infrastructure/docker/databases/abstract_database.py): Define the AbstractDatabase class for database operations.
- fail-detector.py (scripts/fail-detector.py): Develop fail detector script for framework failures.
- db.py (benchmarks/db/db.py): Add TestType class for database tests.
- output_helper.py (utils/output_helper.py): Introduce logging utilities, including QuietOutputStream for conditional logging.
- postgres.py (infrastructure/docker/databases/postgres/postgres.py): Implement the Database class for PostgreSQL operations.
- mongodb.py (infrastructure/docker/databases/mongodb/mongodb.py): Implement the Database class for MongoDB operations.
- mysql.py (infrastructure/docker/databases/mysql/mysql.py): Implement the Database class for MySQL operations.
- plaintext.py (benchmarks/plaintext/plaintext.py): Add TestType class for plaintext tests.
- cached-query.py (benchmarks/cached-query/cached-query.py): Add TestType class for cached-query tests.
- get_maintainers.py (.github/github_actions/get_maintainers.py): Add script to fetch maintainers for affected frameworks.
- query.py (benchmarks/query/query.py): Add TestType class for query tests.
- update.py (benchmarks/update/update.py): Add TestType class for update tests.
- json.py (benchmarks/json/json.py): Add TestType class for JSON tests.
- __init__.py (infrastructure/docker/databases/__init__.py): Add dynamic loading and validation for database modules.
- audit.py (utils/audit.py): Introduce the Audit class for framework consistency checks.
- __init__.py (benchmarks/__init__.py): Add dynamic loading for benchmark test types.
- create.js (infrastructure/docker/databases/mongodb/create.js): Add MongoDB initialization script creating the world and fortune collections with sample data.
- pipeline.sh (benchmarks/load-testing/wrk/pipeline.sh): Add shell script for pipeline load testing with the wrk tool for concurrency and latency measurements.
- concurrency.sh (benchmarks/load-testing/wrk/concurrency.sh): Add shell script for concurrency load testing with the wrk tool for concurrency and latency measurements.
- query.sh (benchmarks/load-testing/wrk/query.sh): Add shell script for query load testing with the wrk tool for concurrency and latency measurements.
- bw-startup.sh (infrastructure/docker/services/bw-startup.sh): Add startup script for BW services with Docker.
- bootstrap.sh (infrastructure/vagrant/bootstrap.sh): Add bootstrap script for Vagrant environment setup.
- bw-shutdown.sh (infrastructure/docker/services/bw-shutdown.sh): Add shutdown script for BW services with Docker cleanup.
- entry.sh (infrastructure/docker/entry.sh): Add Docker entry script for the BW container; uses gosu for user management.
- config.sh (infrastructure/docker/databases/postgres/config.sh): Add configuration script for PostgreSQL setup.
- core.rb (infrastructure/vagrant/core.rb): Add Vagrant core configuration for providers and provisioning.
- pipeline.lua (benchmarks/load-testing/wrk/pipeline.lua): Add Lua script for pipeline request generation in wrk.
- create-postgres.sql (infrastructure/docker/databases/postgres/create-postgres.sql): Add SQL script for PostgreSQL database initialization; creates the World and Fortune tables with sample data.
- label-failing-pr.yml (.github/workflows/label-failing-pr.yml): Add workflow to label failing pull requests.
- ping-maintainers.yml (.github/workflows/ping-maintainers.yml): Add workflow to notify maintainers on completion.
- get-maintainers.yml (.github/workflows/get-maintainers.yml): Add workflow to fetch maintainers for pull requests.
2 files
- README.md (infrastructure/vagrant/README.md): Add Vagrant setup guide covering a VirtualBox development environment using Vagrant.
- README.md (benchmarks/pre-benchmarks/README.md): Add setup instructions for new benchmark tests and applications.